Karn Wong


Platform engineer | Technical Leadership | HashiCorp Ambassador | AWS & GCP 2X
Certified



REASONS WHY YOU SHOULDN'T USE PROGRAMMING LANGUAGES FOR IAC

When it comes to IaC (infrastructure as code), most people have heard of
HashiCorp’s Terraform, Pulumi, or AWS CDK. Terraform uses HCL as its DSL
(interestingly enough, Terraform also has its own CDK that translates
programming languages into Terraform configuration), while the latter two
support general-purpose programming languages. There are mostly two camps:

 * People who swear by HCL and think you shouldn’t use programming languages
   for IaC
 * People who don’t see why you need to pick up a new language in order to use
   IaC, so they prefer a programming language they are already familiar with

Neither camp is wrong; both positions are valid....

August 5, 2024 · 2 min · Karn Wong


GCP'S SERVICE ACCOUNT CREDENTIALS CAN BE A SECURITY RISK. HERE'S HOW TO MITIGATE
THEM.

If you look online, many sources will tell you to use a service account to
authenticate to GCP services. While this is true, it does not apply in every
case. For local development, you should use Application Default Credentials.
Imagine working in a team where you have to work with Cloud Run, so you ask
your infra team for a service account. This looks fine, but then your teammates
also have to work with this service....

July 14, 2024 · 2 min · Karn Wong


THOUGHTS ON SUMMARIZATION SERVICE SYSTEM DESIGN

For a summarization task, there is an input that gets reduced to a handful of
paragraphs. This input is in text format. You don’t necessarily start from
text, though, since the source content can be audio or video files. In that
case the source has to be converted into text first, which involves a
transcription task. Transcription means taking audio and converting it to
text....

June 9, 2024 · 3 min · Karn Wong
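
The flow described in the excerpt above (text passes straight through, audio or video is transcribed first, and the resulting text is then reduced) can be sketched roughly as follows. This is only an illustration, not the design from the post: `Source`, `to_text`, `summarize`, and `run` are hypothetical names, the transcriber is passed in as a plain function, and the summarizer is a naive placeholder that keeps the first few sentences.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Source:
    """Hypothetical input record for the sketch."""
    kind: str     # "text", "audio", or "video"
    payload: str  # the text itself, or a path to a media file


def to_text(source: Source, transcribe: Callable[[str], str]) -> str:
    """Return text for any source, transcribing audio/video first."""
    if source.kind == "text":
        return source.payload
    if source.kind in ("audio", "video"):
        return transcribe(source.payload)
    raise ValueError(f"unsupported source kind: {source.kind}")


def summarize(text: str, max_sentences: int = 3) -> str:
    """Placeholder summarizer: keep the first few sentences.

    A real service would call a summarization model here instead.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:max_sentences]) + "."


def run(source: Source, transcribe: Callable[[str], str]) -> str:
    """Convert the source to text, then summarize it."""
    return summarize(to_text(source, transcribe))
```

Passing the transcriber in as an argument keeps the transcription step swappable, so the same `run` call works whether the source was text to begin with or had to be transcribed.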


FASTER SPARK WORKLOADS WITH COMET

For big data processing, Spark is still king. Over the years, many improvements
have been made to Spark performance. Databricks themselves created Photon, a
Spark engine that can accelerate Spark queries, but it is proprietary to
Databricks. Other alternatives do exist (see here for more details), but they
are not trivial to set up. If you use Apache Arrow DataFusion Comet, however,
it surprisingly does not take much time at all to set up....

April 7, 2024 · 2 min · Karn Wong


SLIM DOWN PYTHON DOCKER IMAGE SIZE WITH POETRY AND PIP

Python package management is not straightforward, since the default package
manager (pip) does not behave like Node’s npm, in the sense that it doesn’t
track dependency versions. This is why you should use Poetry to manage Python
packages: it creates a lock file, so you can be sure that every re-install
produces the same versions. However, this poses a challenge when you want to
build a Docker image with Poetry, because you need an extra pip install poetry
step (unless you bake it into your base Python image)....

April 7, 2024 · 2 min · Karn Wong
© 2024 Karn Wong · Powered by Hugo & PaperMod