Fetch a Mastodon account's posts and save them as text files using Mastodon's [statuses API](https://docs.joinmastodon.org/methods/accounts/#statuses).
- Supports all parameters in Mastodon's statuses API
- Convert posts to markdown
- Customize output file location, name, and extension
- Customize output format and front matter
- Optionally download post media
- Optionally thread posts
- Optionally filter based on post visibility
- Optional affordances for scripting
- Optionally persist fetched post id cursors
I use this tool to create an archive of my Mastodon posts and [syndicate them to my own site](https://garrido.io/microblog/), per IndieWeb's [PESOS philosophy](https://indieweb.org/PESOS).
[Go](https://go.dev/doc/install) is required for installation.
You can clone this repo and run `go build main.go` in the repository's directory, or you can run `go install git.garrido.io/gabriel/mastodon-markdown-archive@latest` to install a binary of the latest version.
I use this tool programmatically, and I do not want to recreate the archive from scratch each time. I thread posts, exclude replies to others, exclude reblogs, and filter out any post that is not public.
Mastodon limits this API to a maximum of 40 posts per request. With `--persist-first` and `--persist-last` I can save cursors for the upper and lower bounds of the posts that were fetched. I then use the API's `max-id`, `min-id`, and `since-id` parameters to get the posts that I need, depending on the case.
Calling this for the first time will fetch the most recent 40 posts. With `--persist-last=./last`, the oldest fetched post id will be saved at `./last`. Calling this command again will set the `last` cursor to the oldest post of the next 40 posts, and so on.
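The cursor walk described above can be sketched in Go. This is a minimal simulation, not the tool's code: `fetchOlder` is a hypothetical stand-in for one API call with a `max-id` style cursor, post ids are plain integers, and the `last` variable plays the role of the persisted `./last` file.

```go
package main

import "fmt"

// pageLimit mirrors the 40-post maximum mentioned above.
const pageLimit = 40

// fetchOlder is a hypothetical stand-in for one API call. It returns up to
// pageLimit ids strictly below maxID, newest first. maxID == 0 means "no cursor".
func fetchOlder(all []int, maxID int) []int {
	var page []int
	for _, id := range all { // all is sorted newest first
		if maxID != 0 && id >= maxID {
			continue
		}
		page = append(page, id)
		if len(page) == pageLimit {
			break
		}
	}
	return page
}

// archiveAll walks backwards through the posts, keeping the oldest id of each
// page as the cursor for the next call, and stops when a call returns nothing.
func archiveAll(all []int) (calls int, fetched int) {
	last := 0 // plays the role of the persisted "last" cursor
	for {
		page := fetchOlder(all, last)
		calls++
		if len(page) == 0 {
			return calls, fetched
		}
		fetched += len(page)
		last = page[len(page)-1]
	}
}

func main() {
	// 100 fake post ids, newest (100) first.
	all := make([]int, 0, 100)
	for id := 100; id >= 1; id-- {
		all = append(all, id)
	}
	calls, fetched := archiveAll(all)
	fmt.Println("calls:", calls, "fetched:", fetched)
}
```

The final call that returns zero posts is what signals the end of the archive, which is the same signal a script can read from the fetched-post count.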
You can use a simple bash script to automate this process. Adding the `--porcelain` flag prints the number of fetched posts to stdout, which the script can use to decide whether to continue or stop fetching posts.
Alternatively, posts can be threaded together using the `--threaded=true` flag. With threading, the descendants of a post will not be written out as separate files. Instead, only the top post will be written out.
The program will aggregate the post's descendants in reverse chronological order and make them available in the template via the [Descendants](https://pkg.go.dev/git.garrido.io/gabriel/mastodon-markdown-archive/client#Post.Descendants) method. This can be used in [templates](#templating) to render threaded posts as a single post, which the [default template does](./files/templates/post.tmpl#L33).
When threading, the `AllMedia` and `AllTags` methods will yield the aggregated [MediaAttachment](https://pkg.go.dev/git.garrido.io/gabriel/mastodon-markdown-archive/client#MediaAttachment) and [Tag](https://pkg.go.dev/git.garrido.io/gabriel/mastodon-markdown-archive/client#Tag) values, respectively.
When the `--visibility` flag is used, only the top post's visibility is evaluated. This is done explicitly to support the common practice in Mastodon of setting threaded replies as `unlisted`.
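As a rough illustration of threading combined with the visibility rule, the sketch below groups a batch of posts under their top post and then filters on the top post only. The `Post` and `Thread` types, `threadPosts`, and `filterByTop` are hypothetical and much simpler than the tool's own types. The sketch also assumes a parent always appears before its replies in the batch and does not model any particular descendant ordering.

```go
package main

import "fmt"

// Post is a hypothetical, trimmed-down post; it is not the tool's client.Post.
type Post struct {
	ID          string
	InReplyToID string // empty for a top-level post
	Visibility  string
}

// Thread pairs a top post with the replies found below it.
type Thread struct {
	Top         Post
	Descendants []Post
}

// threadPosts groups a batch into threads. It assumes a parent appears before
// its replies; a reply whose parent is missing starts its own thread here.
func threadPosts(batch []Post) []Thread {
	threadOf := map[string]int{} // post id -> index into threads
	var threads []Thread
	for _, p := range batch {
		if idx, ok := threadOf[p.InReplyToID]; ok {
			threads[idx].Descendants = append(threads[idx].Descendants, p)
			threadOf[p.ID] = idx
			continue
		}
		threadOf[p.ID] = len(threads)
		threads = append(threads, Thread{Top: p})
	}
	return threads
}

// filterByTop keeps a thread when its top post has the wanted visibility,
// regardless of the visibility of its descendants.
func filterByTop(threads []Thread, visibility string) []Thread {
	var kept []Thread
	for _, t := range threads {
		if t.Top.Visibility == visibility {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	batch := []Post{
		{ID: "1", Visibility: "public"},
		{ID: "2", InReplyToID: "1", Visibility: "unlisted"},
		{ID: "3", InReplyToID: "2", Visibility: "unlisted"},
		{ID: "4", Visibility: "unlisted"},
	}
	for _, t := range filterByTop(threadPosts(batch), "public") {
		fmt.Println("top:", t.Top.ID, "descendants:", len(t.Descendants))
	}
}
```

In this example the `unlisted` replies `2` and `3` survive the `public` filter because they belong to a public top post, while the standalone `unlisted` post `4` is dropped.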
Because of this limit, posts in a thread may end up split across different responses. A user may also maintain a long-lived thread that gets updated sporadically, so a single batch of posts will rarely contain all the descendants of the post.
An orphaned post is a post whose parent is not within a batch of posts returned by a single API call.
In either case, the program will fall back to using the [status context](https://docs.joinmastodon.org/methods/statuses/#context) endpoint to rebuild the corresponding thread from the top.
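The orphan definition above boils down to a membership check against the batch. A minimal sketch, assuming a hypothetical `Post` type with only the two fields needed here:

```go
package main

import "fmt"

// Post is a hypothetical, trimmed-down post; it is not the tool's client.Post.
type Post struct {
	ID          string
	InReplyToID string // empty for a top-level post
}

// orphans returns the replies whose parent is not part of the same batch.
func orphans(batch []Post) []Post {
	inBatch := make(map[string]bool, len(batch))
	for _, p := range batch {
		inBatch[p.ID] = true
	}
	var out []Post
	for _, p := range batch {
		if p.InReplyToID != "" && !inBatch[p.InReplyToID] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	batch := []Post{
		{ID: "10"},
		{ID: "11", InReplyToID: "10"},
		{ID: "12", InReplyToID: "7"}, // parent 7 was returned by an earlier call
	}
	for _, p := range orphans(batch) {
		fmt.Println("post", p.ID, "is orphaned: parent", p.InReplyToID, "is not in this batch")
	}
}
```

Each post returned by `orphans` is a candidate for the status context lookup mentioned above.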
The contents of the file and the filename for each post can be customized using templates. This provides enough flexibility to use this tool for various purposes. The templates are evaluated as Go [text templates](https://pkg.go.dev/text/template), so it should be possible to do anything that's normally supported in a Go template.
For example, if you're using this to syndicate posts to a site built using a static site generator, you can customize the output so that it adheres to specific requirements around front matter structure or filename formats.
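To give a feel for what a template can do, here is a small self-contained sketch that renders YAML front matter with Go's `text/template`. The `Post` struct, its `Visibility` and `Content` fields, and the template itself are assumptions made for the example. The tool's actual template data and the bundled `post.tmpl` are richer.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// Post is a hypothetical subset of the data a template might receive.
type Post struct {
	Id         string
	Visibility string
	Content    string
}

// postTmpl is an illustrative template, not the tool's default post.tmpl.
const postTmpl = `---
id: {{ .Post.Id }}
visibility: {{ .Post.Visibility }}
---
{{ .Post.Content }}
`

// render executes the template with the post exposed as .Post.
func render(p Post) (string, error) {
	t, err := template.New("post").Parse(postTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, struct{ Post Post }{p}); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	out, err := render(Post{Id: "42", Visibility: "public", Content: "Hello, fediverse."})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

Changing the front matter keys or the body layout only requires editing the template string, which is the kind of customization a static site generator typically needs.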
Out of the box, this tool uses the [post.tmpl](./files/templates/post.tmpl) template to create the post file. It converts the post content to markdown, threads replies, and defines some attributes in the front matter using YAML.
```
Back at dual-booting on the [#FrameworkLaptop](https://social.coop/tags/FrameworkLaptop). Last time it was Ubuntu, but now I have gone with [#Fedora](https://social.coop/tags/Fedora) 40 KDE.
I'm impressed with how things just work with this laptop. Major props to the [@frameworkcomputer](https://fosstodon.org/@frameworkcomputer) team for supporting these distros out of the box.
I simply decrypted my drive, shrunk it, created a partition, booted off a USB key, installed Fedora, encrypted both partitions, and that's it.
Also, KDE Plasma 6 looks incredibly crisp on this screen.
```
A different template can be used by passing its path to `--template`. The template must comply with Go template syntax.
Out of the box, this tool uses the post's id and the `.md` extension for the filename. For example, this [post](https://social.coop/@ggpsv/112326240503555949) is saved as `112326240503555949.md`.
A different format for the filename can be used by passing a template string to `--filename`. The string must comply with Go template syntax.
For example, to create post files that are prefixed with the post's creation date in `YYYY-MM-DD` format and suffixed with the post id, pass `--filename='{{.Post.CreatedAt | date "2006-01-02"}}-{{.Post.Id}}.md'`.
Following the HTML example in the [post template section](#post) above, you may customize the filename as `--filename='{{.Post.Id}}.html'` to use HTML as the output file extension.
By default, a post's media is not downloaded. Use the `--download-media` flag with a path to download a post's media. The original media file is downloaded, and the media's id is used as the filename.
For example, `--download-media=./images` saves any media to the `./images` directory.
Once downloaded, the media's path is available in [MediaAttachment.Path](https://pkg.go.dev/git.garrido.io/gabriel/mastodon-markdown-archive/client#MediaAttachment) as an absolute path.
Sprig's [path](https://masterminds.github.io/sprig/paths.html) functions can be used in the templates to manipulate the path as necessary. For example, the [default template](https://git.hq.ggpsv.com/gabriel/mastodon-markdown-archive/src/branch/main/files/templates/post.tmpl#L25-L27) uses `osBase` to get the last element of the filepath.
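To illustrate the effect of reducing an absolute path to its last element inside a template, the sketch below registers a stand-in `osBase` backed by `filepath.Base`. The `MediaAttachment` struct here is hypothetical: only `Path` comes from the text above, and `Description` is made up for the example. The template is illustrative and is not the tool's default one.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
	"text/template"
)

// MediaAttachment is a hypothetical subset of the tool's type.
type MediaAttachment struct {
	Path        string // absolute path of the downloaded file
	Description string // made up for this example
}

// mediaTmpl renders one markdown image per attachment, using only the file name.
const mediaTmpl = `{{ range .Media }}![{{ .Description }}]({{ osBase .Path }})
{{ end }}`

// renderMedia executes the template with a stand-in osBase helper.
func renderMedia(media []MediaAttachment) (string, error) {
	funcs := template.FuncMap{"osBase": filepath.Base}
	t, err := template.New("media").Funcs(funcs).Parse(mediaTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	err = t.Execute(&b, struct{ Media []MediaAttachment }{media})
	return b.String(), err
}

func main() {
	out, err := renderMedia([]MediaAttachment{
		{Path: "/srv/archive/images/1001.png", Description: "A laptop"},
	})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

Using only the base name keeps the rendered post independent of where the archive lives on disk.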
### Bundling
You can use `--download-media=bundle` to save the post media in a single directory with its original post. In this case, the post's filename will be used as the directory name and the post filename will be `index.{extension}`.
For example, `--download-media=bundle --filename='{{ .Post.CreatedAt | date "2006-01-02" }}-{{.Post.Id}}.md'` will create a `YYYY-MM-DD-<post id>/` directory, with the post saved as `YYYY-MM-DD-<post id>/index.md` and media saved as `YYYY-MM-DD-<post id>/<media id>.<media ext>`.
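The bundle layout described above can be expressed as a small path computation. The `bundlePaths` helper below is hypothetical and only mirrors the naming rule stated in the text. It is not taken from the tool's source, and the id and date are made up.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// bundlePaths derives the bundle layout from a rendered post filename:
// the filename without its extension becomes the directory, the post becomes
// index.<ext>, and media is stored next to it under its own id.
func bundlePaths(filename, mediaID, mediaExt string) (dir, post, media string) {
	ext := path.Ext(filename)               // e.g. ".md"
	dir = strings.TrimSuffix(filename, ext) // e.g. "2024-04-24-123"
	post = path.Join(dir, "index"+ext)
	media = path.Join(dir, mediaID+mediaExt)
	return dir, post, media
}

func main() {
	dir, post, media := bundlePaths("2024-04-24-123.md", "1001", ".png")
	fmt.Println(dir)
	fmt.Println(post)
	fmt.Println(media)
}
```

This layout matches what many static site generators call a page bundle, where a post and its assets share one directory.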