|Client:||Max Planck Institute and Helmholtz Institute|
|Task:||Web Scraping and Data Wrangling of spatial datasets with Python, development of an (end-to-end) ML app and dashboards for training/validation of datasets with Python|
|Time period:||10/2022 to 12/2022|
|Technologies:||Python, Requests, pandas, GeoPandas, NumPy, scikit-learn, Matplotlib, Seaborn, Streamlit, Plotly, Ray, JupyterLab, VS Code, Insomnia, GitLab, AWS|
For a Max Planck Institute and a Helmholtz Institute, I built a geospatial dataset in Python using web scraping and data wrangling techniques. The dataset describes various factors that influenced landscapes during the Holocene epoch and can be used to validate ML algorithms.
Together with my team, I also developed a Streamlit app in Python that supports the formulation of scientific hypotheses through machine-assisted analysis of user-supplied data and visualization of the results. In the app, users can upload their datasets, select the processing environment for the analysis (macOS/Ubuntu), and choose among different ML methods and their associated hyperparameters. The results are presented as plots so that users can consolidate their scientific questions and hypotheses.
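The method and hyperparameter selection described above could be wired up roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the registry `METHODS`, the helpers `resolve_params` and `build_model`, and the three scikit-learn estimators are illustrative choices that do not come from the project description.

```python
from importlib import import_module

# Hypothetical registry: method name -> (module, class, default hyperparameters).
# The estimators listed here are illustrative scikit-learn classes,
# not the app's actual selection of ML methods.
METHODS = {
    "random_forest": (
        "sklearn.ensemble",
        "RandomForestClassifier",
        {"n_estimators": 100, "max_depth": None},
    ),
    "knn": ("sklearn.neighbors", "KNeighborsClassifier", {"n_neighbors": 5}),
    "svm": ("sklearn.svm", "SVC", {"C": 1.0, "kernel": "rbf"}),
}


def resolve_params(method, overrides=None):
    """Merge user-chosen hyperparameters over the defaults for one method."""
    if method not in METHODS:
        raise ValueError(f"Unknown method: {method!r}")
    defaults = METHODS[method][2]
    overrides = overrides or {}
    # Reject hyperparameters that the registry does not expose for this method.
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(
            f"Unsupported hyperparameters for {method}: {sorted(unknown)}"
        )
    return {**defaults, **overrides}


def build_model(method, overrides=None):
    """Instantiate the selected estimator; scikit-learn is imported only here."""
    params = resolve_params(method, overrides)
    module_path, class_name, _ = METHODS[method]
    estimator_cls = getattr(import_module(module_path), class_name)
    return estimator_cls(**params)
```

In a Streamlit front end, the method name could come from a widget such as `st.selectbox` and the override values from `st.number_input`, with the returned estimator then fitted on the uploaded data. For example, `build_model("knn", {"n_neighbors": 3})` would return a `KNeighborsClassifier` configured with three neighbors.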
This ML project has been continued as an open-source project since 2023.