!pip install numpy # install package "numpy" from inside IDE
import numpy as np
np.random.normal(0, 1, size=10) # draw 10 values from a normal distribution (mean 0, sd 1)
Basics of Python for Data Science
PhD Course in Psychological Sciences - University of Padova
About This Course
This is a short (10-hour) introductory course on Python offered within the PhD program in Psychological Sciences (University of Padova). R is used more extensively for statistical analysis and data science in this PhD program, but familiarity with Python is useful for several purposes, including advanced machine learning applications, deep learning models, natural language processing, computational efficiency in some scenarios, and programming experiments. Also, Python is widely required in industry and business, so basic proficiency with it is a valuable asset! There are no prerequisites, but having attended Basics of R for Data Science may greatly help (well, it is a mandatory course for our PhD students).
You are encouraged to look at this online book available on GitHub: Python Data Science Handbook
Dates and Rooms 2025
Day | Time | Room |
---|---|---|
Tuesday, June 3rd | 09:00-11:30 | 4R |
Wednesday, June 4th | 13:00-15:30 | 4R |
Thursday, June 5th | 09:00-11:30 | 4S |
Friday, June 6th | 09:00-11:30 | 4S |
Getting Started
Bookmark this course homepage for quick access to this material (https://enricotoffalini.github.io/Basics-Python/). Then do this:
- Install Python: go to the official Python download page and follow the installation instructions for your operating system.
- Install an IDE (Integrated Development Environment): Suggested IDEs are Spyder or Posit-RStudio (which you should have already installed for the other courses; it also supports Python!).
- Test your local setup: Make sure that your Python installation works; open your IDE of choice and run in the console the three lines of code shown at the top of this page.
If you get any errors when running the first line, try installing the package via the terminal with
pip install numpy
- Take a look at Colab: Some basic practice and exercises will be conducted on Google Colab (you need to log in with a Google account), a free online environment for writing and running Python in the browser without any local installation.
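As a rough sketch of the local setup check described above, the script below reports which Python and numpy versions the IDE is actually using and then draws ten random values. It assumes numpy is already installed; the variable name `samples` is only illustrative.

```python
import sys
import numpy as np

# Report which interpreter and numpy version the IDE is actually using
print("Python:", sys.version.split()[0])
print("numpy:", np.__version__)

# Draw 10 random values from a normal distribution (mean 0, sd 1)
samples = np.random.normal(0, 1, size=10)
print(samples)
```

If this runs without errors, the local installation is most likely fine. Note that the `!pip install ...` form is an IPython feature (available in consoles such as Spyder, Jupyter, and Colab); in a plain terminal, use `pip install numpy` without the exclamation mark.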
Course Topics
Getting Started with Python: Environment, Syntax, Tools
An introduction to Python and its ecosystem: setting up Python locally or in the cloud (Google Colab), using IDEs, understanding basic syntax and operations, creating and naming variables, using packages and functions, and working with core data structures (lists, tuples, dictionaries) and indexing.
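To give a flavor of the core data structures and indexing mentioned above, here is a minimal sketch. All variable names and values are made up for illustration and are not taken from the course materials.

```python
# A list is ordered and mutable: elements can be added or changed
scores = [12, 15, 9, 20]
scores.append(17)

# A tuple is ordered but immutable: it cannot be modified after creation
room = ("4R", "09:00")

# A dictionary maps keys to values
participant = {"id": "P01", "age": 27}

# Indexing is zero-based; negative indices count from the end
first_score = scores[0]
last_score = scores[-1]

# Slicing excludes the stop index, so this takes elements 0 and 1
first_two = scores[0:2]

# Dictionary values are accessed by key
age = participant["age"]

print(first_score, last_score, first_two, room[0], age)
```

Zero-based indexing and the excluded stop index in slices are among the differences that R users typically need to get used to.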
Basics of Programming in Python
A hands-on introduction to basic programming concepts such as conditional logic, loops, and writing and using custom functions. These are core skills for writing flexible and efficient Python code.
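The three concepts named above can be combined in a few lines. The following is only an illustrative sketch: the function `classify_rt`, its 500 ms threshold, and the reaction times are hypothetical and do not come from the course exercises.

```python
def classify_rt(rt_ms, threshold=500):
    """Label a reaction time (in milliseconds) as 'fast' or 'slow'."""
    # Conditional logic: choose a label depending on the threshold
    if rt_ms < threshold:
        return "fast"
    else:
        return "slow"


reaction_times = [320, 610, 480, 505]

# Loop: apply the custom function to each value in turn
labels = []
for rt in reaction_times:
    labels.append(classify_rt(rt))

print(labels)
```

Note that Python uses indentation, rather than curly braces as in R, to delimit the body of functions, conditionals, and loops.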
Entering the World of Data Science in Python: pandas, numpy, and more
Explore the Python core libraries for data science. We will learn how to manipulate and analyze tabular data with pandas, handle arrays and numerical operations with numpy, and get a taste of statistical modeling and basic machine learning using statsmodels and scikit-learn.
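As a small taste of what working with pandas and numpy looks like, here is a minimal sketch. It assumes both packages are installed; the data frame and its column names (`group`, `score`) are invented for illustration.

```python
import numpy as np
import pandas as pd

# A small, made-up table with one grouping column and one numeric column
data = pd.DataFrame({
    "group": ["A", "A", "B", "B"],
    "score": [10.0, 14.0, 9.0, 13.0],
})

# pandas: compute the mean score within each group
group_means = data.groupby("group")["score"].mean()
print(group_means)

# numpy: standardize the scores with vectorized array operations
# (numpy's std() uses the population formula, ddof=0, by default)
scores = data["score"].to_numpy()
z_scores = (scores - scores.mean()) / scores.std()
print(z_scores)
```

The `groupby` step is roughly comparable to `aggregate()` or `dplyr::group_by()` followed by `summarise()` in R.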
A Bit on Fancy Topics?
Depending on interest, we may explore more advanced topics such as data visualization, basic machine learning, and use of deep learning and language models available via HuggingFace 🤗, or even simple experiment programming. After all, that’s truly why we want to use Python.
Materials
Slides
- Intro to This Course, aka Why We Should Use Python
- Getting Started: IDEs, Operations, Basic Types of Data
- Virtual Environments, Packages, Import/Export
- Basic Syntax, Typing, Indexing, Differences with R
- Basics of Data Science: Intro to NumPy and Pandas
- Programming: Conditionals, Loops, Custom Functions
Exercises
— The following exercises are fundamental: they integrate concepts from the slides and introduce new functions and methods that you will want to know!
- First steps in Python (via Colab)
- First steps with NumPy and Pandas (via Colab)
- Basics of Programming… with some data science
- Programming like a data scientist
- The data nightmare exercise… in Python
— These other exercises are beyond the scope of this introductory course, but they could be stimulating and useful as simple tutorials for some users:
- Basics of Clustering with K-Means and GMM
  - additional exercise on Simulating clusters and assessing inferential risks
- Basics of Sentiment Analysis with HuggingFace Transformers
- Basics of Text Embeddings (plus PCA and Clustering)
- Basics of Text Classification using Embeddings and Cosine Similarity
  - additional exercise on Text embeddings to automatically evaluate construct validity
- Other Examples of Language, Speech, and Image Processing
  - additional mini-tutorial on AI/LLM as research assistants in systematic reviews
GitHub repository associated with the present course website: https://github.com/EnricoToffalini/Basics-Python
Access padlet
Many thanks to Filippo Gambarota for sharing his expertise with using GitHub and Quarto, to Margherita Calderan for her valuable assistance with programming experiments, and to Tommaso Feraco for a fruitful collaboration on the use and interpretation of semantic embeddings.
Read a modest proposal for open source software in the PhD program in Psychological Sciences