Welcome to my GitHub profile! As an incoming graduate student, I'm passionate about digging into data to uncover insights and drive impactful decisions. My interests span business analytics, data analysis, business intelligence, data science, and project management.
- Data Analysis: Discovering trends and patterns to enhance business strategies, and learning newer tools used by data analysts such as Hex, Apache Superset, and Databricks.
- Business Intelligence: Designing dashboards and reports to visualize data effectively, and learning BI tools such as Qlik Sense, SAP, and Looker.
- Problem Solving: Applying data-driven methods to complex challenges such as machine learning anomaly detection, what-if analysis simulation, and Root Cause Analysis (RCA), a methodical approach to identifying the underlying cause of a problem rather than addressing its superficial symptoms.
- Data Science: Leveraging machine learning and statistical techniques to solve real-world problems. I'm also expanding into edge computing for real-time analytics, which processes data near its source to reduce latency and speed up analysis.
- Advanced Machine Learning:
  - Deep Learning and Neural Networks: Delving into more complex architectures such as Convolutional Neural Networks (CNNs) for image data and Recurrent Neural Networks (RNNs) for sequential data.
  - Reinforcement Learning: Studying how agents learn to act in an environment so as to maximize cumulative reward.
  - Explainable AI (XAI): Focusing on techniques that make the outputs of machine learning models more understandable to humans.
- Data Engineering:
  - Data Pipelines and ETL Processes: Learning to design robust, scalable data pipelines and automate workflows with modern tools like Apache Airflow and Prefect.
  - Data Governance and Quality: Ensuring that data is accurate, consistent, and used responsibly, especially in compliance-heavy industries.
  - Streaming Data Processing: Working with real-time data streams using Apache Kafka and Apache Flink to handle continuous data efficiently.
- Big Data Technologies:
  - Hadoop and Spark: Building skills in managing and processing big data with the Hadoop ecosystem (HDFS, YARN, and MapReduce), along with Apache Spark for faster in-memory processing.
  - NoSQL Databases: Exploring NoSQL databases for workloads where horizontal scalability matters more than a relational model, with a focus on MongoDB (document), Cassandra (wide-column), and Neo4j (graph).
  - Cloud Big Data Solutions: Learning about cloud platforms like Amazon Redshift, Google BigQuery, and Azure HDInsight, which provide scalability and performance on demand.
- Federated Learning: Studying a machine learning setting where the goal is to train a high-quality shared model while the training data stays distributed across a large number of clients.
- Automated Machine Learning (AutoML): Studying tools and techniques that automate the selection of models, configurations, and pre-processing steps for specific data science tasks.
- Ethics and AI: Understanding the ethical implications of AI and machine learning, with a focus on developing and deploying algorithms responsibly to reduce bias and promote fairness.
Thank you for visiting my profile! Feel free to explore my repositories to see how I'm applying my skills to real-world challenges.