This project used the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework to analyze startup growth, funding patterns, success rates, regional concentration, and market penetration in the Indian startup ecosystem between 2018 and 2021.
The objective, as a data analyst, was to draw insights from the 4 available tables and give actionable recommendations to our team that, hypothetically speaking, wants to venture into the Indian startup market. The project used statistical techniques and visualization tools to draw conclusions and present our findings.
Code | Jupyter Notebook | Published Article | Power BI Dashboard |
---|---|---|---|
LP1 | Notebook | Medium Article | Power BI Dashboard |
- Project Links
- Dataset
- Some Tools Used For The Project
- Process
- Project Structure
- Key Insights
- Dashboard
- How To Use This Repository
- Author
- Company/Brand: Name of the company/start-up
- Sector: Sector of service
- What it does: Description of the company
- Investor: Investors
- Amount($): Funds raised
- Stage: Round of funding reached
- Year: The year of funding
- Pull data from the remote database with pyodbc and save it as CSV files
- Develop questions and a hypothesis to base the analysis on
- Understand the data and decide how to process it
- Preprocess, clean and merge the data. The data was very messy, and 90% of the project involved cleaning it and getting it ready for analysis and visualization
- Visualize the data with seaborn and matplotlib.pyplot
- Create a Power BI dashboard with the visualizations
- Write a Medium article briefly describing the process, findings and recommendations
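The first and fourth steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `export_table_to_csv` and `parse_amount`, the `funding_demo` table, and the amount formats handled are all assumptions. The standard-library `sqlite3` module stands in for the remote database so the sketch runs anywhere; since pyodbc also follows the Python DB-API 2.0, a connection obtained from `pyodbc.connect(...)` with your own connection string should work the same way.

```python
import csv
import re
import sqlite3
import tempfile
from pathlib import Path


def export_table_to_csv(conn, table, csv_path):
    """Dump every row of `table` to `csv_path` through a DB-API 2.0 connection."""
    cursor = conn.cursor()
    # Table names cannot be bound as query parameters, so only pass trusted names.
    cursor.execute(f"SELECT * FROM {table}")
    header = [column[0] for column in cursor.description]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(cursor.fetchall())
    return header


def parse_amount(raw):
    """Turn a funding string such as '$1,200,000' into a float, else None."""
    if raw is None:
        return None
    # Assumed formats: optional '$', thousands separators, plain digits.
    text = str(raw).strip().replace(",", "").replace("$", "")
    match = re.fullmatch(r"\d+(?:\.\d+)?", text)
    return float(match.group(0)) if match else None


if __name__ == "__main__":
    # sqlite3 is a stand-in here; the project itself pulls from a remote database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE funding_demo (company TEXT, amount TEXT)")
    conn.executemany(
        "INSERT INTO funding_demo VALUES (?, ?)",
        [("Acme", "$1,200,000"), ("Globex", "Undisclosed")],  # invented rows
    )
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "funding_demo.csv"
        export_table_to_csv(conn, "funding_demo", out)
        with open(out, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                print(row["company"], parse_amount(row["amount"]))
```

If the CSV files are loaded into pandas, a helper like this could be applied with something like `df["Amount($)"].map(parse_amount)`. Values in other currencies or with free-text notes come back as `None` and would need their own handling.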
- Dataset/: Contains the dataset used for analysis.
- .gitignore: Holds files to be ignored by Git.
- LICENSE: Project license.
- Project_notebook.ipynb: The Jupyter notebook with data cleaning, EDA and visualizations.
- README.md: Project overview, links, highlights, and information.
- requirements.txt: Required libraries & packages.
- India's startup ecosystem is thriving, with funding strongly favoring technology companies
- Equal opportunities exist for non-technology sectors as well
- The fintech and retail sectors are particularly active in the Indian startup ecosystem
- Mumbai is at the forefront of developments in the fintech and retail sectors
- Non-technical industries have a few large investments that skew the mean higher
- Technical industries have a larger number of smaller investments, driving up the total amount raised
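The last two insights describe a common effect of skewed data: a handful of very large deals inflates the mean, while many smaller deals add up to a larger total. The sketch below illustrates this with invented numbers; none of the figures come from the project's dataset, and `summarize` is a hypothetical helper.

```python
from statistics import mean, median

# Hypothetical funding amounts in USD, invented purely for illustration.
non_tech = [500_000, 750_000, 90_000_000]    # few deals, one very large
tech = [2_000_000] * 40 + [3_000_000] * 20   # many moderate deals


def summarize(amounts):
    """Return deal count, total, mean and median for a list of amounts."""
    return {
        "deals": len(amounts),
        "total": sum(amounts),
        "mean": mean(amounts),
        "median": median(amounts),
    }


for label, amounts in (("non-tech", non_tech), ("tech", tech)):
    print(label, summarize(amounts))
```

In this toy example the non-tech group has the higher mean but the lower total, and its median sits far below its mean, which is why comparing sectors on the mean alone can mislead.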
The Indian startup ecosystem is a vibrant and influential force in the global market. This project provides valuable insights into funding patterns and industry preferences. By leveraging this information, stakeholders can make informed decisions and contribute to the growth and success of startups in India.
You need Python 3 on your system. Clone this repo, then run the remaining commands from the repo's root (repository_name> ...).
- Clone this repository:
  git clone https://github.com/Azie88/LP1-Data-Analysis.git
- In your IDE, create a virtual environment and install the required packages for the project:
  - Windows:
    python -m venv venv; venv\Scripts\activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
  - Linux & macOS:
    python3 -m venv venv; source venv/bin/activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
The two long command lines have the same structure. They chain several commands with the ; separator (on Windows this works in PowerShell; in Command Prompt, use & instead), but you can also run the commands one after the other. Each command line does the following:
- Create a Python virtual environment that isolates the project's required libraries to avoid conflicts;
- Activate the virtual environment so that the Python kernel and libraries are those of the isolated environment;
- Upgrade pip, the package manager, to an up-to-date version;
- Install the libraries/packages listed in the requirements.txt file so that they can be imported into the Python script and notebook without any issue.
NB: macOS users, please install Xcode if you run into an issue.
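To confirm the setup worked, a small check like the one below can help. It is only a sketch: `in_virtualenv` and `missing_packages` are hypothetical helpers, not part of this repository, and the requirement-line parsing is simplified (it assumes plain `name==version` style lines).

```python
import re
import sys
from importlib import metadata


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def missing_packages(requirement_lines):
    """Return the names from requirements-style lines that are not installed."""
    missing = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as "-r other.txt"
        # Keep only the distribution name, dropping extras, version pins and markers.
        name = re.split(r"[\s\[=<>!~;@]", line, maxsplit=1)[0]
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    try:
        with open("requirements.txt", encoding="utf-8") as handle:
            print("Missing packages:", missing_packages(handle) or "none")
    except FileNotFoundError:
        print("requirements.txt not found; run this from the repo's root")
```

Run it with the virtual environment activated; an empty list of missing packages suggests the install step completed.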
- Explore the Jupyter notebook for detailed steps and code execution.
- Check out the Power BI dashboard for interactive visualizations.
- Read the published article for a comprehensive understanding of the project.
Andrew Obando
Connect with me on LinkedIn
Follow me on Medium
Feel free to star ⭐ this repository if you find it helpful!