I started this video as a review of my previous day, part of the January Coding Challenge. In the end, it turned into a code review of my ETL pipeline for fetching and processing URLs. I also cover how I plan my coding tasks and how I use GitHub issues and projects to keep track of current and future features.
In this video post, I cover the following topics:
- Code review of the data streaming pipeline that fetches URLs, compresses their data, and saves it for further processing.
- Demo of the URL parser that extracts text, image, and link information from a web page.
- Open tasks for marking processed URLs for further indexing.
- Review of previously finished tasks.
- How I use GitHub issues and projects to manage work on my research project.
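
To give a rough idea of the compress-and-parse steps listed above, here is a minimal sketch using only the Python standard library. It is not the code reviewed in the video: the names `PageParser` and `process_page` are hypothetical, gzip is an assumed choice of compression, and the fetching step is left out so the example runs without network access.

```python
import gzip
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects visible text, image sources, and link targets from HTML."""

    def __init__(self):
        super().__init__()
        self.text, self.images, self.links = [], [], []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        if not self._skip and data.strip():
            self.text.append(data.strip())


def process_page(html: str) -> dict:
    """Compress the raw page for storage and extract text, images, and links."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    return {
        "compressed": gzip.compress(html.encode("utf-8")),
        "text": " ".join(parser.text),
        "images": parser.images,
        "links": parser.links,
    }


if __name__ == "__main__":
    sample = '<p>See <a href="/about">about</a></p><img src="/logo.png">'
    result = process_page(sample)
    print(result["text"], result["images"], result["links"])
```

In a real pipeline the compressed bytes would be written to storage keyed by URL, and the extracted fields passed on to the indexing step.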