3.1 Common Tools and Terms

This section introduces several common tools and terms. After each installation instruction, you can click the image to install the tool you need.

3.1.1 Anaconda

Anaconda is an integrated platform for data science that gives you a quick start in analytics. It offers an easy way to do Python and R data science and machine learning on a single machine, and it gives you access to thousands of open-source packages and libraries.

Related terms: 3.1.2, 3.1.3

Installation:

Recommend

3.1.2 Python

Python is a widely used, general-purpose, high-level programming language. It was created by Guido van Rossum, first released in 1991, and further developed by the Python Software Foundation. It was designed with an emphasis on code readability, and its syntax allows programmers to express their concepts in fewer lines of code.
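As a small illustration of the readability point above, the sketch below counts the most frequent words in a sentence using only the standard library. The function name `most_common_words` and the sample sentence are made up for this example.

```python
from collections import Counter


def most_common_words(text, n=3):
    """Return the n most frequent words in text, ignoring case."""
    words = text.lower().split()           # split on whitespace
    return Counter(words).most_common(n)   # list of (word, count) pairs


# Hypothetical sample text, used only for illustration.
sample = "the quick brown fox jumps over the lazy dog the end"
print(most_common_words(sample, 1))  # prints [('the', 3)]
```

The whole task fits in two lines of logic, which is the kind of brevity the description refers to.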

Easy Installation:

Recommend

Official Installation:

py

Free Online Python Tutorial

3.1.3 R

R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language, of which R is an implementation, is often the vehicle of choice for research in statistical methodology, and R provides an open-source route to participation in that activity.

Easy Installation:

  • Step 1: Install Anaconda (see 3.1.1)

Recommend

  • Step 2: Open Anaconda Prompt

  • Step 3: Type “conda install r”

Official Installation:

R

Online R Programming Tutorial

3.1.4 SQL (MySQL)

SQL (Structured Query Language) is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful for handling structured data, i.e. data that incorporates relations among entities and variables.

MySQL is an open-source relational database management system (RDBMS). Its name is a combination of “My”, the name of co-founder Michael Widenius’s daughter My, and “SQL”, the abbreviation for Structured Query Language. A relational database organizes data into one or more data tables in which data may be related to each other; these relations help structure the data.
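To make the idea of related tables concrete, here is a minimal sketch that runs SQL from Python. It uses SQLite through Python's built-in `sqlite3` module as a stand-in, not MySQL, so that it works without installing a database server. The table names (`students`, `enrollments`) and the data are invented for illustration.

```python
import sqlite3

# In-memory SQLite database: a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")

# Two tables related through enrollments.student_id -> students.id.
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(id),
        course TEXT NOT NULL
    );
    INSERT INTO students (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO enrollments (student_id, course) VALUES
        (1, 'Statistics'), (1, 'Databases'), (2, 'Databases');
""")

# Join the two tables and count the courses per student.
rows = conn.execute("""
    SELECT s.name, COUNT(e.course) AS n_courses
    FROM students AS s
    JOIN enrollments AS e ON e.student_id = s.id
    GROUP BY s.id, s.name
    ORDER BY s.name
""").fetchall()

print(rows)  # prints [('Ada', 2), ('Grace', 1)]
conn.close()
```

The `JOIN` is where the relation between the tables is used: each enrollment row is matched to the student it belongs to. The same `SELECT` statement should also work on MySQL, although connecting to a MySQL server needs a separate driver.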

Installation:

SQL

Free Online SQL Tutorial

3.1.5 Microsoft Excel

Microsoft Excel is a spreadsheet application developed by Microsoft for Windows, macOS, Android and iOS. It features calculation capabilities, graphing tools, pivot tables, and a macro programming language called Visual Basic for Applications (VBA). Excel forms part of the Microsoft Office suite of software.

Key terms you may encounter:

Free Download for students

3.1.6 Data Visualization

Data visualization is the practice of translating information into a visual context, such as a map or graph, to make data easier for the human brain to understand and pull insights from. The main goal of data visualization is to make it easier to identify patterns, trends and outliers in large data sets.
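The point about spotting outliers can be sketched even without a plotting library. The example below draws a text-only bar chart; in practice a dedicated visualization tool would be used instead. The function `text_bar_chart` and the `daily_orders` figures are hypothetical and exist only for this illustration.

```python
def text_bar_chart(values, width=40):
    """Return one line per labelled value, with a bar scaled to the largest value.

    Assumes `values` is a non-empty dict of positive numbers.
    """
    largest = max(values.values())
    lines = []
    for label, value in values.items():
        bar = "#" * round(width * value / largest)  # bar length proportional to value
        lines.append(f"{label:<4}{bar} {value}")
    return lines


# Made-up data: Thursday is far higher than the other days.
daily_orders = {"Mon": 21, "Tue": 19, "Wed": 23, "Thu": 80, "Fri": 20}
for line in text_bar_chart(daily_orders):
    print(line)
```

In the raw numbers the unusual Thursday value takes a moment to find; in the chart its bar is several times longer than the others and stands out immediately.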

Useful data visualization tools: