blog.det.life
162.159.153.4
Public Scan
Submitted URL: http://blog.det.life/
Effective URL: https://blog.det.life/?gi=19c554cf5094
Submission: On August 16 via manual from MA — Scanned from DE
Form analysis
0 forms found in the DOM
Text Content
Data Engineer Things
Insights and ideas on data and engineering.
Topics: ETL, Data Architecture, Optimization, Interview Guide, Career Growth, AI in Data Engineering

How does Notion handle 200 billion data entities?
From PostgreSQL → Data Lake
Vu Trinh (Aug 6)

Trending Now

No, Data Engineers Don’t NEED dbt.
But It Sure Does Solve a Lot of Problems
Leo Godin (Jul 19)

A Practitioner’s Guide to Developing Data Engineering Solutions with Databricks
Development Approaches, Environments, CI/CD and Testing with Databricks
Eduard Popa (Jul 25)

Apache Kafka — Important Designs
Filesystem, Zero-copy, and Batching
Vu Trinh (Jul 13)

Diving Deep into LinkedIn’s Data Infrastructure: My 6-Hour Learning & Key Takeaways
Things I distill after reading the paper: Data Infrastructure at LinkedIn
Vu Trinh (Aug 3)

Netflix Maestro and Apache Airflow — Competitors or Companions in Workflow Orchestration?
How Netflix Maestro and Apache Airflow complement each other. Delve into their features, strengths, and use cases.
Volker Janz (Jul 29)

Getting Started with APIs for Data Engineers
So what are APIs and what do they do?
Aminat Lawal (Jul 26)

Latest stories

Creating Business Value with Databricks: The Role of Solution Architects
Bridging the gap between stakeholders and data teams to bring valuable data solutions into production
Eduard Popa (Aug 15)

How Did LinkedIn Handle 7 Trillion Messages Daily With Apache Kafka?
Was adding more machines enough?
Vu Trinh (Aug 14)

Timeless Skills for Navigating the Evolving World of Data Engineering
What technologies and programming languages should you learn to become a data engineer?
Ben Rogojan (Aug 12)

Perhaps the ultimate Orchestration Tool was in front of us all along
Hopefully you’ve been using this all along
Hugo Lu (Aug 11)

I spent 4 hours learning Apache Iceberg. Here’s what I found.
The table format’s overview and architecture
Vu Trinh (Aug 10)

Understanding Flight Cancellations and Rescheduling in Airlines Using Databricks and PySpark
Using Databricks and PySpark for Enhanced Flight Operations in the Airline Industry.
Brahmareddy, The Data Engineer. (Aug 9)

Big-O Essentials for Data Engineers
Essential Concepts to Enhance Your Coding Efficiency
Santosh Joshi (Aug 8)

This is What I will do to Become a Data Engineer in 2025
Discover how to become a data engineer with this comprehensive guide, whether you’re a beginner or an intermediate software…
Syed Kadar Ansari Syed Ahamed (Aug 5)

Adding a custom source to PyAirbyte using the no-code builder
Learn how to build your customized data sources with PyAirbyte
Felix Gutierrez (Aug 4)

Make Your Own Data Diff CLI from Scratch using DBT, Snowflake and Python: Part 1
Background
Matthew Macias (Aug 1)

Eliminate Data Errors: Four SQL Techniques to Enhance Data Quality
Introduction
Rajanikant Vellaturi (Jul 30)

Batch to Streaming ETL with Redpanda Connect
This post covers a demo of how to convert a batch delivered complex CSV file into a realtime stream in AVRO format. It is split into…
Mark Olliver (Jul 29)

Migrating Your Existing ELT Data Pipeline to PyAirbyte
Leverage the power of data integration with PyAirbyte
Felix Gutierrez (Jul 27)

Apache Kafka — Consumer
The clients who read
Vu Trinh (Jul 27)

Perfect Data Pipeline: How to Build Them Nearly Flawless
Great for data engineers aiming to optimize data workflows and decision-making processes in their projects.
Rui Carvalho (Jul 26)

A Data Quality Starter Toolkit: Building Trustworthy Data with YData, Soda, and pandas
A hands-on walkthrough of some key data quality tooling
Eva Revear (Jul 21)

Stream Processing Systems: RisingWave vs ksqlDB
Understand the differences between ksqlDB and RisingWave, two powerful streaming systems to decide the right solution for your use case.
RisingWave Labs (Jul 21)

CAP theorem — What Every Data Engineer Should Know
A Data Engineer’s Guide to Balancing Consistency, Availability, and Partition Tolerance
Santosh Joshi (Jul 20)

Apache Kafka — Producer
The clients who write
Vu Trinh (Jul 20)

Exploring Advanced Open Data Formats: Apache Hudi, Apache Iceberg, and Delta Lake
Discover the in-depth details of Apache Hudi, Apache Iceberg, and Delta Lake
Syed Kadar Ansari Syed Ahamed (Jul 18)

DBT isn’t dynamic: Part 2
A full data pipeline explanation
Cai Parry-Jones (Jul 13)

How I built a Scalable, Robust, and Cost-Effective Data Platform for a Fintech Company
In today’s data-driven business world, having a unified data platform is crucial for effectively storing and processing data based on…
Sainath (Jul 12)

Execute Azure Data Factory REST APIs with Python
Understand the step-by-step process to execute Azure Data Factory (ADF) REST APIs with Python. The guide includes multiple working…
Rahul Madhani (Jul 12)

Building a Local Data Lake from scratch with MinIO, Iceberg, Spark, StarRocks, Mage, and Docker
Hello again, fellow technology enthusiasts! I am a software/data engineer who transitioned from data science. The learning curve in this…
George Zefkilis (Jul 12)

Data Modeling with Snowflake: A concise critical review
Should you skip, borrow, or buy?
Chad Isenberg (Jul 12)

About Data Engineer Things
Things learned in our data engineering journey and ideas on data and engineering.
Followers: 7.3K