I've been digging into Ruby's stdlib RSS parser for a side project and am very
impressed by the overall experience. Here's how easy it is to get started:
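A minimal sketch of the sort of snippet I mean, using a placeholder RSS 2.0 feed URL:

require "rss"
require "open-uri"

# Fetch a feed and print each item's title.
URI.open("https://example.com/feed.xml") do |raw|
  feed = RSS::Parser.parse(raw)
  feed.items.each do |item|
    puts item.title
  end
end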
That said, doing something interesting with the resulting feed is not quite so
simple.
For one, you can't just support RSS. Atom is a more recent standard used by many
blogs (although I think it's irrelevant in the world of podcasts). There's about a
50% split in the use of RSS and Atom in the tiny list of feeds that I follow, so
a feed reader must handle both formats.
Adding Atom support introduces an extra branch to our snippet:
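Continuing the sketch above, the branch looks something like this (titles are pulled differently per format):

feed = RSS::Parser.parse(raw)

titles =
  case feed
  when RSS::Rss
    feed.items.map { |item| item.title }
  when RSS::Atom::Feed
    feed.entries.map { |entry| entry.title.content }
  end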
The need to handle both standards independently is kind of frustrating.
That said, it does make sense from a library perspective. The RSS gem is
principally concerned with parsing XML per the RSS and Atom standards, returning
objects that correspond one-to-one. Any conveniences for general feed reading
are left to the application.
Wrapping the RSS gem in another class helps encapsulate differences in
standards:
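A rough shape for that wrapper; the Feed class, its Entry struct, and the normalized fields are just one way to slice it:

class Feed
  Entry = Struct.new(:title, :link, :published_at)

  def initialize(parsed)
    @parsed = parsed
  end

  # Normalize both formats into the same Entry struct.
  def entries
    case @parsed
    when RSS::Rss
      @parsed.items.map do |item|
        Entry.new(item.title, item.link, item.pubDate)
      end
    when RSS::Atom::Feed
      @parsed.entries.map do |entry|
        Entry.new(entry.title.content, entry.link.href, entry.updated.content)
      end
    end
  end
end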
Worse than dealing with competing standards is the fact that not everyone
publishes the content of an article as part of their feed. Many bloggers only
use RSS as a link aggregator that points subscribers to their webpage, omitting
the content entirely:
<rssversion="2.0"><channel><title>Redacted Blog</title><link>https://www.redacted.io</link><description>This is my blog</description><item><title>Article title goes here</title><link>https://www.redacted.io/this-is-my-blog</link><pubDate>Thu, 25 Jul 2024 00:00:00 GMT</pubDate><!-- No content! --></item></channel></rss>
How do RSS readers handle this situation? The solution varies based on the app.
The two I've tested, NetNewsWire and Readwise Reader, manage to include the
entire article content in the app, despite the RSS feed omitting it (assuming no
paywalls). My guess is these services make an HTTP request to the source,
scraping the resulting HTML for the article content and ignoring everything
else.
Firefox users are likely familiar with a feature called
Reader View
that transforms a webpage into its bare-minimum content. All of the layout
elements are removed in favor of highlighting the text of the page. The JS
library that Firefox uses is open source on their GitHub:
mozilla/readability.
On the Ruby side of things there's a handy port called
ruby-readability that we can use
to extract omitted article content directly from the associated website:
require"ruby-readability"URI.open("https://jvns.ca/atom.xml")do|raw|
feed =RSS::Parser.parse(raw)
url =case feed
whenRSS::Rss
feed.items.first.link
whenRSS::Atom::Feed
feed.entries.first.link.href
end# Raw HTML content
source =URI.parse(url).read
# Just the article HTML content
article_content = Readability::Document.new(source).content
end
So far the results are good, but I haven't tested it on many blogs.
Today I found myself at the bottom of a rabbit hole, exploring how
Zod's refine method interacts with form validations. As with
most things in programming, reality is never as clear-cut as the types make it
out to be.
Today's issue concerns
zod/issues/479, where refine
validations aren't executed until all fields in the associated object are
present. Here's a reframing of the problem:
The setup:
I have a form with fields A and B. Both are required fields, say required_a
and required_b.
I have a validation that depends on the values of both A and B, say
complex_a_b.
The problem:
If, say, A is not filled out, the form parses with errors [required_a],
not [required_a, complex_a_b]. In other words, complex_a_b only pops up as
an error once both A and B are filled out.
Here's an example schema that demonstrates the problem:
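Something along these lines (the field and message names are mine, not from the issue):

import { z } from "zod";

const schema = z
  .object({
    a: z.string().min(1, "required_a"),
    b: z.string().min(1, "required_b"),
  })
  .refine((data) => data.a !== data.b, {
    message: "complex_a_b",
    path: ["b"],
  });

// With `a` missing, parsing reports only the error for `a`; the .refine
// never runs, so complex_a_b won't appear until both a and b parse.
schema.safeParse({ b: "hello" });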
This creates an experience where a user fills in A, submits, sees a validation
error pointing at B, fills in B, and sees another validation error pointing at
complex_a_b. The user has to play whack-a-mole with the form inputs to make
sure all of the fields pass validation.
As programmers, we're well-acquainted with error messages that work like this.
And we hate them! Imagine a compiler that suppresses certain errors until
prerequisite ones are fixed.
If you dig deep into the aforementioned issue thread, you'll come across the
following solution (credit to
jedwards1211):
From a type perspective, I understand why Zod doesn't endeavor to fix this
particular issue. How can it assert the types of A or B when running the
complex_a_b validation if either field might still be missing? Treating
them as optional inside complex_a_b would defeat the z.string() type that
asserts the field is required.
How did I fix it for my app? I didn't. I instead turned to the form library,
applying my special validation via the form API instead of the Zod API. I
concede defeat.
Stumbled on the emacs-aio library today
and its introduction post. What a
great exploration into how async/await works under the hood! I'm not sure I
totally grok the details, but I'm excited to dive more into Emacs generators and
different concurrent programming techniques.
The article brings to mind Wiegley's
async library, which is probably the
more canonical library for handling async in Emacs. From a brief look at the
README, async looks like it actually spawns independent processes, whereas
emacs-aio is really just a construct for handling non-blocking I/O more
conveniently.
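For flavor, here's the kind of code the library enables; this tiny demo (the function name is mine) just awaits a timer without blocking Emacs:

(require 'aio)

(aio-defun my/aio-demo ()
  (message "before")
  ;; Yields to the event loop; Emacs stays responsive while waiting.
  (aio-await (aio-sleep 2))
  (message "after two seconds"))

;; Calling it returns a promise immediately rather than blocking.
(my/aio-demo)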
I've written small-medium sized packages -- 400 to 2400 lines of elisp -- that
use generators and emacs-aio (async/await library built on generator.el) for
their async capabilities. I've regretted it each time: generators in their
current form in elisp are obfuscated, opaque and not introspectable -- you
can't debug/edebug generator calls. Backtraces are impossible to read because
of the continuation-passing macro code. Their memory overhead is large
compared to using simple callbacks. I'm not sure about the CPU overhead.
That said, the simplicity of emacs-aio promises is very appealing:
(defun aio-promise ()
  "Create a new promise object."
  (record 'aio-promise nil ()))

(defsubst aio-promise-p (object)
  (and (eq 'aio-promise (type-of object))
       (= 3 (length object))))

(defsubst aio-result (promise)
  (aref promise 1))
Lichess is an awesome website, made even more awesome by
the fact that it is free and open source. Perhaps lesser known is that the
entire Lichess puzzle database is available for free download under the Creative
Commons CC0 license. Every puzzle that you normally find under
lichess.org/training is available for your
perusal.
This is a quick guide for pulling that CSV and seeding a SQLite database so you
can do something cool with it. You will need
zstd.
First, wget the file from
Lichess.org open database and save it
into a temporary directory. Run zstd to uncompress it into a CSV that we can
read via Ruby.
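Something like this; the exact file name on database.lichess.org may have changed, so double-check the download page:

wget https://database.lichess.org/lichess_db_puzzle.csv.zst -P /tmp
zstd -d /tmp/lichess_db_puzzle.csv.zst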
A separate seed script pulls items from the CSV and bulk-inserts them into
SQLite. I have the following in my db/seeds.rb, with a few omitted additions
that check whether or not the puzzles have already been migrated.
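A sketch of that script, assuming a Puzzle model whose columns roughly mirror the CSV headers (PuzzleId, FEN, Moves, Rating, Themes, and so on); the duplicate checks are omitted here too:

require "csv"

BATCH_SIZE = 10_000
batch = []

CSV.foreach("/tmp/lichess_db_puzzle.csv", headers: true) do |row|
  batch << {
    puzzle_id: row["PuzzleId"],
    fen: row["FEN"],
    moves: row["Moves"],
    rating: row["Rating"].to_i,
    themes: row["Themes"]
  }

  if batch.size >= BATCH_SIZE
    Puzzle.insert_all(batch)  # bulk insert instead of one INSERT per row (Rails 6+)
    batch.clear
  end
end

Puzzle.insert_all(batch) if batch.any?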
I've blogged before about why I really dislike apps like Notion for
taking quick notes since they're so slow to
open. The very act of opening the app to take said note often takes 10 or more
seconds, typically with a whole bunch of JavaScript-inflicted loading states and
blank screens. By the time I get to the note, I've already lost my train of
thought.
As it turns out, this pain point is a perfect candidate for the iOS Shortcuts
app. I can create an automated workflow that captures my text input instantly
but pushes to Notion in the background, allowing me to benefit from Notion's
database-like organization but without dealing with the pitiful app performance.
Type predicates
have been around for a while, but today I found a particularly nice application. The
situation is this: I have an interface that has an optional field, where the
presence of that field means I need to create a new object on the server, and
the lack of the field means the object has already been created and I'm just
holding on to it for later. Here's what it looked like:
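In rough terms (the Thing interface here is simplified; only the optional blob field matters):

interface Thing {
  id: string;
  // Present only when the file still needs to be uploaded to the server.
  blob?: File;
}

// Only accepts things that definitely carry a blob.
declare function uploadNewThings(things: (Thing & { blob: File })[]): Promise<void>;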
The intersection type Thing & { blob: File } means that uploadNewThings only
accepts things that have the field blob. In other words, things that need to
be created on the server because they have blob content.
However, TypeScript struggles if you try to simply filter the list of things
before passing it into uploadNewThings:
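Something like this, reusing the declarations from the sketch above:

declare const things: Thing[];

uploadNewThings(things.filter((thing) => !!thing.blob));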
Argument of type 'Thing[]' is not assignable to parameter of type '(Thing & { blob: File; })[]'.
  Type 'Thing' is not assignable to type 'Thing & { blob: File; }'.
    Type 'Thing' is not assignable to type '{ blob: File; }'.
      Types of property 'blob' are incompatible.
        Type 'File | undefined' is not assignable to type 'File'.
          Type 'undefined' is not assignable to type 'File'.
The tl;dr: despite filtering things by thing => !!thing.blob, TypeScript does
not recognize that the elements of the filtered array are actually
Thing & { blob: File }.
The obvious workaround is to cast, but casting is bad! It's error-prone and
doesn't really solve the problem that TypeScript is hinting at. Instead, use a
type predicate:
const hasBlob = (t: Thing): t is Thing & { blob: File } => !!t.blob

uploadNewThings(things.filter(hasBlob))
With the type predicate (t is Thing & ...) I can inform TypeScript that I do
in fact know what I'm doing, and that the call to filter results in a
different interface.
Most runners run not because they want to live longer, but because they want
to live life to the fullest. If you're going to while away the years, it's far
better to live them with clear goals and fully alive than in a fog, and I
believe running helps you do that. Exerting yourself to the fullest within
your individual limits: that's the essence of running, and a metaphor for
life—and for me, writing as well. - Haruki Murakami
What I traditionally would've used Rake tasks for has been replaced with
data-migrate, a little gem that
handles data migrations in the same way as Rails schema migrations. It's the
perfect way to automate data changes in production, offering a single pattern
for handling data backfills, seed scripts, and the like.
The pros are numerous:
Data migrations are easily generated via CLI and are templated with an up
and a down method so folks think about rollbacks (see the sketch after this list).
Just like with Rails schema migrations, there's a migration ID kept around
that ensures data migrations are run in order. Old PRs will have merge
conflicts.
You can conditionally run data migrations alongside schema migrations with
bin/rails db:migrate:with_data.
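To make the first couple of points concrete, the workflow looks roughly like this; the migration name and its contents are made up, and the generated template may differ between gem versions:

# bin/rails generate data_migration backfill_account_slugs
# creates db/data/20240725120000_backfill_account_slugs.rb:

class BackfillAccountSlugs < ActiveRecord::Migration[7.1]
  def up
    Account.where(slug: nil).find_each do |account|
      account.update!(slug: account.name.parameterize)
    end
  end

  def down
    # Often a no-op for backfills, but the template makes you decide.
  end
end

# Then run it with bin/rails data:migrate, or together with schema
# migrations via bin/rails db:migrate:with_data.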
It's a really neat gem. I'll probably still rely on the good ol' Rake task for
my personal projects, but will doubtless keep data-migrate in the toolbox for
teams.
On the inside cover of Kafka on the Shore, Murakami explains how his idea for
the book started with its title. This approach is the opposite of anything I've
ever written, though I recognize there's a notable difference between fiction
and technical writing. But what a powerful idea: a simple phrase shapes the
entire story.
I dug up this quote from an interview:
When I start to write, I don’t have any plan at all. I just wait for the story
to come. I don’t choose what kind of story it is or what’s going to happen. I
just wait.