For most data analysis projects, your goal is to produce the highest-quality analysis in the least amount of time.
If you understand the underlying concepts behind what you’re doing, then you can use either language to perform your analysis.
For example, if you understand the principles of natural language processing, data cleaning, and machine learning, you can implement an automated text summarizer in R or Python.
As time goes on, data analysis in R and Python is becoming more similar as great packages like pandas, rvest, and ggplot bring concepts from one language into the other.
Given that, for most cases, I’d recommend using whichever language you’re most familiar with. Here are some main points of differentiation between the languages to be aware of, though:
R has a much bigger library of statistical packages
If you’re doing specialized statistical work, R packages cover more techniques. You can find R packages for a wide variety of statistical tasks using the CRAN Task Views. R packages cover everything from psychometrics to genetics to finance. Although Python, through SciPy and packages like statsmodels, covers the most common techniques, R is far ahead.
Python is better for building analytics tools
R
and Python are equally good if you want to find outliers in a dataset,
but if you want to create a web service to enable other people to upload
datasets and find outliers, Python is better. Python is a general
purpose programming language, which means that people have built modules
to create websites, interact with a variety of databases, and manage
users.
In general, if you want to build a tool or service that uses data analysis, Python is a better choice.
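To make the outlier example a bit more concrete, here is a minimal sketch using only Python's standard library. It assumes an interquartile-range rule for what counts as an outlier, and the names `find_outliers` and `handle_upload` are hypothetical. A real service would sit behind a web framework, with the handler receiving the uploaded file's contents.

```python
import csv
import io
import json
import statistics


def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def handle_upload(csv_text, column):
    """Hypothetical handler: parse uploaded CSV text and report outliers as JSON."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    return json.dumps({"column": column, "outliers": find_outliers(values)})
```

The analysis itself is a few lines in either language. The parsing, serialization, and (in a real tool) routing and user management around it are where a general purpose language helps.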
R builds in data analysis functionality by default, whereas Python relies on packages
Because Python is a general purpose language, most data analysis functionality is available through packages like NumPy and pandas.
However, R was built with statistics and data analysis in mind, so many
tools that have been added to Python through packages are built into
base R.
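A small illustration of that difference, sketched with plain Python and its standard library (the variable names are just for this example). In base R, `c(1, 2, 3) * 2` multiplies element-wise and `mean()` is available without loading anything. Plain Python behaves differently:

```python
import statistics

values = [1, 2, 3]

# A Python list is a general purpose container, so `*` repeats the list
# instead of multiplying each element.
repeated = values * 2

# Element-wise arithmetic needs an explicit comprehension
# (or an array package such as NumPy).
doubled = [v * 2 for v in values]

# Even a mean needs an import; here it comes from the standard library.
average = statistics.mean(values)

print(repeated)  # [1, 2, 3, 1, 2, 3]
print(doubled)   # [2, 4, 6]
print(average)   # 2
```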
Python is better for deep learning
Through packages like Lasagne, Caffe, Keras, and TensorFlow, creating deep neural networks is straightforward in Python. Although some of these, like TensorFlow, are being ported to R, support is still far better in Python.
Python relies on a few main packages, whereas R has hundreds
In Python, scikit-learn (sklearn) is the “primary” machine learning package, and pandas is the “primary” data analysis package. This makes it easy to know how to accomplish a task, but also means that a lot of specialized techniques aren’t covered by the main packages.
R, on the other
hand, has hundreds of packages and ways to accomplish things. Although
there’s generally an accepted way to accomplish things, the lines
between base R, packages, and the tidyverse can be fuzzy for inexperienced folks.
R is better for data visualization
Packages like ggplot2
make plotting easier and more customizable in R than in Python. Python
is catching up, particularly in the area of interactive plots with
packages like Bokeh, but has a way to go.
The bottom line
Performing
data analysis tasks in either language is more similar than you might
expect. As long as you understand the underlying concepts, pick the
language that you’re most familiar with.
R has an edge in statistics and visualization, whereas Python has an advantage in machine learning and building tools.
If
you’re new to data analysis, I’d advise learning Python, because it’s
more straightforward and more versatile, but I’d also advise focusing on
the concepts and quality of the analysis over language. At Dataquest, we teach data science by focusing on the concepts and helping you build projects and add value.
I personally like where tools like Jupyter Notebook and Beaker Notebook are headed in terms of letting you use either language, sometimes in the same analysis.
source: quora