An enthusiastic data engineer with a strong interest in applying ML to new, emerging problems. I often work on several independent projects, which requires me to learn new technologies quickly and apply them to real-world problems. In my current role I develop, maintain and improve streaming data pipelines for IoT solutions, and analyse their outputs to gain insight into the collected data. I also regularly take part in ad-hoc side projects where data science and ML skills are essential to tackle the problems at hand.
my journey in a nutshell

Technical skills

Programming

PROFICIENT
  • Python
  • SQL
  • KQL (Kusto)
  • RegEx
  • MATLAB
  • R
  • LaTeX
FAMILIAR WITH
  • NoSQL
  • HTML
  • jQuery
  • CSS

Dev Tools

PROFICIENT
  • PyTorch
  • NumPy
  • Pandas
  • Databricks, PySpark
  • Azure: IoT Hub, Logic Apps, ADX
  • spaCy
  • Seaborn
  • Matplotlib
  • Power BI backend
  • Web scraping, Selenium
  • Azure cloud services
FAMILIAR WITH
  • TensorFlow
  • Azure Synapse Analytics
  • Keras
  • Scikit-learn
  • OpenCV
  • SciPy
  • Tableau, PBI
  • NLTK
  • SSIS, SSMS

Concepts

PROFICIENT
  • Neural Networks
  • Deep learning
  • IoT
  • ETL
  • Cloud Computing
  • Big Data
  • Convolutional NN
  • Semantic Segmentation
  • Audio source separation
  • Natural Language Processing
FAMILIAR WITH
  • Meta-learning
  • Sequence models
  • Support Vector Machines
  • Generative Adversarial Networks
  • Shallow learning
  • Confidence Interval
  • Principal Component Analysis

Technical Experience

Work

IoT Data Engineer
Oct 2022 – present
IFUA Horváth
  • Plan, build and maintain end-to-end IoT data pipelines in Azure using IoT Hub, Azure Data Explorer, Data Lake, Logic Apps and Synapse
  • Analyse historical data from IoT devices in Databricks (Python, PySpark) to identify outliers, correlations and energy-saving opportunities for clients
  • Develop ETL processes (SSIS, SSMS, SQL) and backend data models in Power BI
  • Develop analytical tools in Python for domain experts
  • Regularly research novel IoT techniques and advances
Data Engineer (freelance)
2022, 6 months
Neuron Solutions
  • Ran inference with an image-based fault detection model and carried out EDA, performance analysis and evaluation to improve the ML model
  • Extracted information from medical text data using Pandas, spaCy (NER) and regex to prepare it for training ML models that predict lung tumour stages
IoT Developer (Junior)
Dec 2021 – Oct 2022
IFUA Horváth
As a developer in the IoT (Internet of Things) section of the Enterprise Analytics group, I took part in a range of projects involving data science, cloud computing, big data, live data and machine learning, and carried out research on novel ML and IoT techniques and advances.
NLP Engineer
Sep 2021
Hackerintro
Using spaCy, we built a rule-based model and trained a custom statistical NER model to detect individuals' names and their IT-related skills in CVs / resumes.
Data Science Intern
Aug – Sep 2017
IFUA Horváth
In the Data Science team, I identified distinct customer behaviours in a large customer dataset from an oil and gas company, using k-means clustering and other data analysis and ML techniques in Python.
Engineer Intern
Aug 2015
AVM Konferenciatechnika Kft.
At this company I learnt to solder different kinds of cables and to assemble and install audiovisual systems, e.g. in car showrooms and bank branches.

    Academic coursework

      Mathematics
    • Statistical Concepts
    • Statistical Methods
    • Probability
    • Calculus
    • Linear Algebra
    • Stochastic Processes
    • Mathematical Modelling assignments
    • Mathematical Finance

    Education & Certifications

    Academic

    MSci (2:1), Durham University, UK
    2015 – 2020
    Master in Science in Natural Sciences
    Computer Science and Mathematics
    Class II Division 1 with Honours
    Milestone Institute
    2014 – 2015

    The institute gathers talented Hungarian secondary school students and prepares them to study at the world's top universities. It introduces them to foreign university systems (e.g. the UK) and provides extracurricular classes beyond those taught in secondary schools, mentorship, student clubs, and opportunities to grow both personally and professionally before, during and after university.

    Achievements & Conferences

    Guest lecturer: Introduction to AI and ML + use cases
    2022, Corvinus University, Business IT club
    • We gave a lecture to university students on machine learning and its applications, explaining the differences between AI, ML and deep learning, the key concepts behind them, and how they relate to one another. We also touched on the algorithms behind machine and deep learning and the fields they are applied in. We closed with real-life examples to reinforce what had been taught.
    Presentation of our IoT solutions
    2022, Budapest, IoT Live Show by Yettel
    • Presented our own IoT solutions, ideas for future projects and use cases in various industries (e.g. retail), including a live camera demo that tracked the gaze point and described the physical traits of the "customers".
    Data+AI Summit (by Databricks)
    2022, online
    • Data Analysis with Databricks SQL training course
    Data Innovation Summit
    2022, online
    Reinforce Conference
    2022, Budapest

    Extracurricular activities & Interpersonal skills

    Founder and President of Latin Social Dance Society
    2017 – 2019, Durham University
    • Leadership
    • Teamwork
    • Collaboration skills
    • Organisational skills
    • Time management
    • Public speaking
    • Management skills
    • Creative & critical thinking
    Latin dance instructor
    2016 – present
    • Public speaking
    • Effective communication
    • Teaching and explanation skills
    European Solidarity Corps volunteer
    2020 – 2021, Asociación Las Niñas del Tul, Granada
    • We organised and ran Erasmus+ projects for young people at local, national and international level across Europe, bringing cultures and knowledge together and educating young people through non-formal and informal methods. These projects included youth exchanges, training courses, seminars and transnational meetings.
    Active Societies Representative
    2018 – 2019, Durham Students' Union
    • Made sure the voices of active societies were heard and considered by the Students' Union
    • Organised occasional gatherings and events for active societies
    Treasurer and Social Secretary
    2016 – 2017, Collingwood College Basketball Club, Durham University
    • Managed the club's finances and spending
    • Organised social events for club members
    Hotel entertainer / animator
    Aug – Sep 2019, Hotel Gran Castillo Tagoro (Lanzarote), Acttiv
    • Organised and led programmes for kids and adults
    • Public speaking (e.g. at evening shows)
    • Worked in a team of 15 entertainers
    • Communicated with clients
    Languages
    • English • Hungarian • Spanish

    Interests

    • Dancing
    • Basketball
    • Hiking
    • Skiing

    Stock price prediction & classification

    2021, private project

    Developed a statistical CNN model with various features (e.g. multiscale) to predict the next-day stock price movement of a given market from the previous X days.

    CNN / MCNN / PyTorch / stock / time series

    Name and tech skills detection in CVs

    2021, Hackerintro

    Using spaCy, we built a rule-based model and trained a custom statistical NER model to detect individuals' names and their IT-related skills in CVs / resumes.

    NLP / spaCy / Named Entity Recognition / NER