URL: https://cloudflare-general-assesment.pages.dev/
Submission: On April 26 via automatic, source rescanner — Scanned from DE

Bryce Wilson's Social Media Site


bryce Systems design
Alright, now that I have some time again, let's write about how I want to design
my systems assignment. For these types of projects, I like to think through my
design beforehand and write it down. I don't leave it set in stone though.
Coding is like going on a journey through uncharted land. I can use my existing
knowledge to try to plot a course but that may be wrong. It may be that by the
time I get close, I find out that my original goal was not the best one after
all.

I managed to find a good TLS library that does exactly what I want. Since I have
that as an option, I am going to start with TLS only but adding non-TLS could be
done someday. I'll also start off with the generic back end so I can run it on
my mac and on my Linux server. The design I explain below should allow it to be
very easy to add io_uring and Network Framework (though I have not finished my
Rust bindings for Network Framework quite yet).

So the first thing that happens is the front end asks the back end to start
listening. This function is expected to return right away so that it can support
different back ends, which may or may not have to start a thread. On generic and
io_uring, this creates a new thread that calls listen. On Network Framework,
this will do the various configuration steps and create a listener whose
callback runs on the utility dispatch queue (though for max performance you
would want a higher QoS queue so I'll probably make that configurable).
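
A minimal sketch of that contract for the generic back end, using only the standard library. The `Backend` trait, `GenericBackend` type, and `start_listening` signature are hypothetical names chosen for illustration, and the per-connection thread mirrors the generic design described later in this post rather than any real code.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Hypothetical contract between the front end and a back end.
pub trait Backend {
    /// Starts listening and returns right away with the bound address.
    /// `on_conn` is called once for every accepted connection.
    fn start_listening(&self, addr: &str, on_conn: fn(TcpStream)) -> io::Result<SocketAddr>;
}

/// Portable back end: one accept thread plus one thread per connection.
pub struct GenericBackend;

impl Backend for GenericBackend {
    fn start_listening(&self, addr: &str, on_conn: fn(TcpStream)) -> io::Result<SocketAddr> {
        let listener = TcpListener::bind(addr)?;
        let local = listener.local_addr()?;
        // The accept loop lives on its own thread so this call can return.
        thread::spawn(move || {
            for stream in listener.incoming() {
                match stream {
                    Ok(stream) => {
                        thread::spawn(move || on_conn(stream));
                    }
                    Err(_) => break, // stop accepting on a listener error
                }
            }
        });
        Ok(local)
    }
}
```

An io_uring or Network Framework back end would implement the same trait but set up its own event source instead of spawning an accept thread.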

On the io_uring implementation, we need to create a structure for storing the
connection data. What I want is to map a u64 (the userspace identifier in
io_uring) to our own structure that will contain various bits of state. This
kind of problem could be solved with a huge number of different data structures
with different performance and space characteristics. I'm just going to do what
is the easiest for me. So my plan is to use a basic dynamic array of an enum
which either contains the client entry or a pointer / optional reference to the
next item in the free list. When we need to create a new entry, if there is
something on the list, we take it, otherwise, expand the list. This isn't the
best in many ways but it should work fine. If I had more time, I could write a
much better unsafe version. Maybe I will do it if I finish everything else. It
would be a good tool to have for other situations.
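
A minimal sketch of this dynamic array with an embedded free list, in safe Rust. The names `Slot` and `ConnTable` are hypothetical, and the value type is left generic where the real table would hold per-connection state.

```rust
/// One slot in the table: either a live entry or a link in the free list.
enum Slot<T> {
    Occupied(T),
    Free { next: Option<usize> },
}

/// Growable table mapping a u64 id (usable as io_uring user data) to a value.
pub struct ConnTable<T> {
    slots: Vec<Slot<T>>,
    free_head: Option<usize>,
}

impl<T> ConnTable<T> {
    pub fn new() -> Self {
        ConnTable { slots: Vec::new(), free_head: None }
    }

    /// Stores `value`, reusing a free slot if there is one, and returns its id.
    pub fn insert(&mut self, value: T) -> u64 {
        match self.free_head {
            Some(idx) => {
                // Pop the head of the free list and reuse its slot.
                let next = match &self.slots[idx] {
                    Slot::Free { next } => *next,
                    Slot::Occupied(_) => unreachable!("free list points at a live slot"),
                };
                self.free_head = next;
                self.slots[idx] = Slot::Occupied(value);
                idx as u64
            }
            None => {
                // Nothing free: grow the array.
                self.slots.push(Slot::Occupied(value));
                (self.slots.len() - 1) as u64
            }
        }
    }

    /// Looks up the entry for `id`, if it is live.
    pub fn get_mut(&mut self, id: u64) -> Option<&mut T> {
        match self.slots.get_mut(id as usize) {
            Some(Slot::Occupied(value)) => Some(value),
            _ => None,
        }
    }

    /// Removes the entry for `id` and pushes its slot onto the free list.
    pub fn remove(&mut self, id: u64) -> Option<T> {
        let idx = id as usize;
        if !matches!(self.slots.get(idx), Some(Slot::Occupied(_))) {
            return None;
        }
        let old = std::mem::replace(&mut self.slots[idx], Slot::Free { next: self.free_head });
        self.free_head = Some(idx);
        match old {
            Slot::Occupied(value) => Some(value),
            Slot::Free { .. } => None,
        }
    }
}
```

Insert and remove are both O(1), and ids are reused, so a stale id can refer to a newer connection. A generation counter per slot would be one way to guard against that.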

So the basic idea is that the TLS end controls everything. It starts by
scheduling a read from the network. When it gets new packets, it processes them
and possibly sends data to the front. The front tells TLS if it needs to send
something and if it wants more data or needs to do more processing. The TLS
knows when it needs to read or write to the network so it will send that off as
needed. It can also get a signal from the front to end the connection or if it
gets confused, it will just do it on its own. Since this works by the front
returning state information to TLS, if the back has its own TLS implementation
(like in Network Framework), it can just call the front directly.

The first thing that happens when we receive a connection is that we call into
the TLS system. TLS immediately returns saying we need a read. That is
scheduled. Once the read happens, we call into TLS which may call into the
front, etc. So I guess it's a bit like a state machine. I like state machines so
that makes sense. On generic, we create a new thread for each connection and
that thread just calls TLS directly. This is not great but remember that generic
is just a reference implementation that is not intended to be used since almost
all OSs have something better. As I said before, I want to implement an io_uring
version for Linux and maybe a Network Framework version for macOS but if I had
time, Windows has IO Completion Ports and BSD (and macOS but I'm assuming
macOS's own Framework is better which may be true) has kqueue.
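
The read, process, write cycle described above can be sketched as a small state machine. Everything here is an assumption made for illustration: the `FrontStep`, `IoStep`, and `Session` names are hypothetical, and the "TLS" layer is a pass-through placeholder that does no cryptography.

```rust
/// What the front end reports back after seeing plaintext.
pub enum FrontStep {
    NeedMore,
    Send(Vec<u8>),
    Close,
}

/// What the TLS layer asks the back end to do next.
#[derive(Debug, PartialEq)]
pub enum IoStep {
    Read,
    Write(Vec<u8>),
    Close,
}

/// Placeholder for the TLS layer sitting between the back end and the front.
pub struct Session<F: FnMut(&[u8]) -> FrontStep> {
    front: F,
}

impl<F: FnMut(&[u8]) -> FrontStep> Session<F> {
    pub fn new(front: F) -> Self {
        Session { front }
    }

    /// Called when the connection is accepted: the first step is a read.
    pub fn start(&mut self) -> IoStep {
        IoStep::Read
    }

    /// Called when a read completes. An empty read means the peer closed.
    pub fn on_read(&mut self, data: &[u8]) -> IoStep {
        if data.is_empty() {
            return IoStep::Close;
        }
        // A real TLS layer would decrypt here and encrypt the reply below.
        match (self.front)(data) {
            FrontStep::NeedMore => IoStep::Read,
            FrontStep::Send(reply) => IoStep::Write(reply),
            FrontStep::Close => IoStep::Close,
        }
    }

    /// Called when a write completes: go back to reading.
    pub fn on_write_done(&mut self) -> IoStep {
        IoStep::Read
    }
}
```

A thread-per-connection back end would run these steps in a blocking loop, while a completion-based one would call `on_read` and `on_write_done` from its completion handler.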

At this point, I think that I just need to start writing it. I'll make some more
posts if things change or I have any other ideas.
2307596732superlongtestusernamesuperlongtestusernamesuperlongtestusername Test
Post
<script>alert("test")</script>
bryce Quick XSS test
<script>alert("test")</script>
bryce Systems!
I really love working with systems. To implement what you requested using
libraries would be quite easy. I could spend time doing the integration but I
want to show off what I am the best at.

Basically, my plan is to go low-level and implement my own IO and protocol
handling. I'll still leave TCP to the OS but I will do the rest myself.

I plan to implement two or three network backends. The generic backend is the
simplest. It creates a thread that calls Rust's built-in listen. The resulting
streams are sent using a Single Producer, Multiple Consumer queue that I wrote
for another project. These are picked up by a pool of listener threads. There is
one of these for each possible simultaneous connection. These read from the
connection, put the data into a pre-allocated structure for that connection, and
then call back out to the front which parses and does any other processing
before returning to the back to read more or close the connection.
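
A rough sketch of the queue-plus-pool shape described above. It swaps in the standard library's channel behind a mutex as a stand-in for the custom Single Producer, Multiple Consumer queue, and it is generic over the item type so it can be tried without sockets. The name `spawn_pool` is hypothetical.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Spawns `workers` threads that each pull items from a shared queue and
/// pass them to `handle`. Returns the sending half plus the join handles.
pub fn spawn_pool<T, F>(workers: usize, handle: F) -> (Sender<T>, Vec<JoinHandle<()>>)
where
    T: Send + 'static,
    F: Fn(T) + Send + Sync + 'static,
{
    let (tx, rx) = channel::<T>();
    // Sharing the receiver behind a mutex gives the multiple-consumer side.
    let rx = Arc::new(Mutex::new(rx));
    let handle = Arc::new(handle);
    let threads = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            let handle = Arc::clone(&handle);
            thread::spawn(move || loop {
                // The lock is held only while waiting for the next item.
                let item = rx.lock().unwrap().recv();
                match item {
                    Ok(item) => (*handle)(item),
                    Err(_) => break, // sender dropped: shut down
                }
            })
        })
        .collect();
    (tx, threads)
}
```

In the generic backend the item type would be the accepted stream, and the listening thread would hold the `Sender`.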

When building for Linux, a special Linux-only backend will be used. This will
call listen in a thread as before, but now we will use io_uring for the reads
and writes. io_uring is the fastest IO on Linux and really any OS since it uses
ring buffers shared by the OS and the program. This means that processing
threads can pick things off of the completion queue and put things onto the
submission queue and handle many more requests before any blocking has to
happen. As far as I know, none of the HTTP libraries for Rust use this method.

If I get time, I will build a macOS version as well using the Network Framework.
This is actually the first implementation that I have been working on for my
mail server project. It's a lot harder because Apple didn't really expect people
to use this outside of C, Objective-C, and Swift. It's possible but I've had to
write my own ffi because those I found in existing crates just don't work for my
use case.

I don't have a Windows machine myself, but if I did, I would build a backend
similar to the one for Linux using IO Completion ports. But until then, they
will be stuck with the generic backend.

I have a rough idea of how the frontend will work but I can't seem to find the
words to describe it. I'll make some update posts as I go. I'll probably use a
library for the JWTs but I'll have to write my own HTTP library since all the
ones I found don't allow for custom backends. This means I'll have to stick to
HTTP/1 which isn't great. Maybe if I have time, I'll learn enough crypto to get
HTTP/2 working. I'll have to see.

I'll have to see how this all goes. I always like to learn new things and this
happens to be an area that I am actively researching for my mailserver project
so there are many unknowns. I think it's better to do something new each time
than to just do things that you already know how to do.
bryce Page design
I based the design of my page on a simplified version of Instagram. It looks
nothing like it but I had to start somewhere. If I had more time (and was
applying for a design job), I would make it look a lot nicer. I actually really
like doing web design. I'm not great at it but I did design a very cool page for
a class as well as the page for my EVIX project (see https://evix.org). In all
of my coding, I am a big fan of doing things from scratch. So while I did use
SCSS as a preprocessor, I wrote all of the SCSS myself, no mixins. I wrote
my own HTML by hand as well. I think it's really fun to do.

Since I didn't use any libraries at all, the functionality is pretty basic.
Showing and hiding the new post menu is actually just a CSS thing. When you
click the plus, it changes the URL fragment to #post which causes this menu to
no longer be display: none. Clicking submit or X then just changes to a
different fragment.

I'm not great at JS so I just wrote the most basic things that I could manage. I
have a function which is called when pressing submit which creates the POST
request from the fields and sends it then closes the post menu. When the post is
sent, it calls another function to update the posts displayed on the page.

On the page load and on submitting a new post, we refresh the posts on the page.
I set up a get request to get the JSON for the posts. Once I've got that, I
clear the existing posts and add in each of the new ones. I just use string
interpolation to create the new objects. I know using a template tag would
probably be the best but I wanted to do something I was confident in.

I'm overall quite happy with how this turned out. I'm not a front-end guy at all.
I usually stick to systems level stuff like my current personal project which is
writing an IMAP/SMTP server. Since I prefer systems, I am going to work more on
the Systems challenge and only come back here to implement comments and
reactions if I finish. I'll detail my systems plans in my next post.
bryce Backend design
For my worker, I decided to use Rust. While Rust is a more recent language for
me, I am really enjoying using it. I hope to be able to use it for more personal
projects and maybe even in my work. Rust is compiled to WASM with a JS shim to
use on Workers.

Now that I'm done, I think that were this a full project, I would rewrite it a
bit differently. I would still use Rust but I would write my own custom
bindings. Since even the Rust API for Workers still has to bridge back to JS,
there are a lot of places where the most convenient thing is not the fastest. I
won't go through all of them but overall, I think custom bindings are the best
for now. Maybe WASM will eventually allow web APIs that don't have to go through
JS at all which would eliminate any of these concerns.

Anyway, let's look at what I did. I used the new Rust bindings that Cloudflare
put out recently. This made it very easy to log my request, route it to the
endpoint, and handle the request. Rust can deal with async quite well and
internally creates the call-back structure needed to use async JS functions even
though WASM functions can't be async themselves. This was a major issue for me
when I first tried to write a worker in C a couple years ago before I learned
Rust. I know now a way to do it but it is still not as convenient as in Rust.

For the posts GET endpoint, I grab the KV store for posts and list them. I sort
them based on the timestamp so most recent is first. I then go through the posts
and actually get the full post from the KV store using some convenient Rust
functions to ignore any errors. I create some CORS headers (for all endpoints)
since I couldn't get Pages working on my custom domain. Finally, I convert the Vec
into JSON and send it off!
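
A small sketch of the sorting step, assuming the key format from the KV post (username, a newline, then a unix millis timestamp). The function names are hypothetical and plain strings stand in for the listed KV keys. The actual worker ignores errors in its own way, which is approximated here by dropping keys that do not parse.

```rust
/// Splits a key of the form "username\nmillis" into its two parts.
fn parse_key(key: &str) -> Option<(&str, u64)> {
    let (user, millis) = key.rsplit_once('\n')?;
    Some((user, millis.parse().ok()?))
}

/// Orders keys so the most recent post comes first, skipping malformed keys.
pub fn newest_first(keys: &[String]) -> Vec<&str> {
    let mut parsed: Vec<(u64, &str)> = keys
        .iter()
        .filter_map(|k| parse_key(k).map(|(_, ts)| (ts, k.as_str())))
        .collect();
    // Descending by timestamp; the sort is stable, so ties keep list order.
    parsed.sort_by(|a, b| b.0.cmp(&a.0));
    parsed.into_iter().map(|(_, k)| k).collect()
}
```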

For the POST endpoint, I deserialize the JSON and let serde ensure it is in
the right format. I then check that the username doesn't contain a newline.
Special characters are OK: I escape the text before inserting it into the HTML,
so I'm not too worried about XSS. I should probably remove other control
characters but I don't
right now. I format the key, use that to insert into the KV store, then send out
a basic response with the header mentioned before.
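
A hedged sketch of the two checks mentioned above: rejecting a newline in the username while building the key, and escaping text before it is placed into HTML. The function names are hypothetical, and where the escaping actually runs (worker or page script) is not stated in this post, so this only shows the technique.

```rust
/// Builds the KV key "username\nmillis", or None if the username
/// contains the newline that is used as the separator.
pub fn post_key(username: &str, unix_millis: u64) -> Option<String> {
    if username.contains('\n') {
        return None;
    }
    Some(format!("{}\n{}", username, unix_millis))
}

/// Escapes the characters that are special in HTML text and attributes.
pub fn escape_html(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    out
}
```

With escaping like this, a post body such as the XSS test posts on this page is rendered as visible text instead of being run as a script.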

I also accept OPTIONS for the posts endpoint to send the needed headers.
Anything else is just passed through unchanged though I'm not sure that I've
done that correctly. I'm not entirely clear on the API.

That's about it, the rest is just from the Rust Workers template which adds some
better error reporting.
bryce My KV Thoughts
For this version of my site, I just have one KV store. It uses keys which are
the username of the poster, a newline as a separator, and a unix millis
timestamp. This allows users to make up to one post per millisecond which I
think is reasonable. It also would allow listing posts by user in the future. I
used a newline as the separator because it seemed like a reasonable thing to
restrict from the username. Currently the website will not tell you that the
username is invalid.

Each value is just the post object to be displayed. In the future, I might add a
score of some sort similar to Reddit's score based on up and down votes. The
issue with doing this directly is that KV is eventually consistent which
means that atomic operations like increment do not work well. I could use
durable objects if they were available to free accounts but they are not.

My solution to this would be to create a separate KV namespace which would have
keys containing the post ID and the username of the person making the reaction.
The value would be whether it was an up vote, a down vote, or none (if someone
added a vote then removed
it). This would make it easy to look up if a user made a reaction so that could
be displayed. I would then have a scheduled worker action tally up the reactions
for each post and add that data to the post. It could do this somewhat
infrequently since the total counts don't have to always be accurate, just
eventually consistent which is what KV is good at. By having the displayed
counts on the post object directly, the client would not have to run a list
for every single post to tally up all of the reactions. They would only need to
check the reaction table for the user's own reaction which is a simple get.
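
The scheduled tally could look roughly like this. It is a sketch under assumptions: the reaction entries are shown as in-memory tuples of post ID, username, and vote instead of real KV keys and values, and the `Vote` and `tally` names are hypothetical.

```rust
use std::collections::HashMap;

/// Stored reaction value. `Removed` is the "none" case where a vote
/// was added and later taken back.
#[derive(Clone, Copy)]
pub enum Vote {
    Up,
    Down,
    Removed,
}

/// Counts (up, down) votes per post from (post_id, username, vote) entries.
pub fn tally<'a>(reactions: &[(&'a str, &'a str, Vote)]) -> HashMap<&'a str, (u32, u32)> {
    let mut totals: HashMap<&'a str, (u32, u32)> = HashMap::new();
    for &(post_id, _user, vote) in reactions {
        let entry = totals.entry(post_id).or_insert((0, 0));
        match vote {
            Vote::Up => entry.0 += 1,
            Vote::Down => entry.1 += 1,
            Vote::Removed => {} // counted as no vote
        }
    }
    totals
}
```

The scheduled worker would then write each pair of counts back into the matching post object.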

Comments are quite easy since you can list by prefix. I would just have a KV
namespace with keys that are the post id then some sort of comment id (probably
comment id and time). You can then run a list with the post id as a prefix to
get all of the comments.
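
To illustrate why prefix listing makes this easy, here is a sketch that uses a sorted in-memory map as a stand-in for the KV namespace. The `list_prefix` name and the `/` separator between post id and comment id are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Returns the keys that start with `prefix`, in sorted order.
pub fn list_prefix<'a>(store: &'a BTreeMap<String, String>, prefix: &str) -> Vec<&'a str> {
    store
        // Keys are sorted, so every match sits in one contiguous run
        // that starts at the first key >= prefix.
        .range(prefix.to_string()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, _)| k.as_str())
        .collect()
}
```

Putting a sortable timestamp right after the post id in the key would make the listed comments come back in posting order.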

One of the nice things about having a timestamp on all of these is that I can sort
posts (and later comments) by when they were posted with the most recent at the
top.

While I am probably not going to implement images, I would do so using
Cloudflare's Images service which allows convenient storage and resizing of
images.
bryce Another interesting post
Some content here
brycemw A test post
This is a test post to see if I can now actually make posts that work!
bar foo
blah