wingsite

Automatically Sending Scans to Paperless

published: 2024-09-22
tags: ftp printers self-hosting

I've been looking for some form of document management for a while, since I've amassed a lot of documents over the years, ranging from receipts to medical files, user manuals and whatever else I felt like saving. A few months ago my friend Archie pointed out Paperless-ngx to me and I finally got around to trying it out a few weeks ago.

It's exactly what I was looking for as it turned out :D Paperless is... exactly what it sounds like. You put documents in and can tag them, assign correspondents and get a nice web interface so you can easily access them from multiple devices. It also auto tags things with local machine learning.

The interface is simple: after logging in you get a dashboard showing various statistics about the documents you already have, as well as an upload button for adding more. I still find it a bit unusual that the first page you see isn't the actual document list, which is the second link in the navigation menu, but that's fine.

After using Paperless for a little bit I had a thought: what if I could automatically put the things I'm scanning into Paperless, instead of what I was doing, which was scanning each thing to my computer and then uploading it to Paperless? Well, as it turns out, Paperless has a consume folder that it checks every so often, and it imports anything it finds in there!
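To see the consume folder in action without a scanner, a file can simply be copied into it. Here is a minimal Python sketch of that; the function name drop_into_consume is made up for illustration and is not part of Paperless, and the consume path depends entirely on your own setup.

```python
import shutil
from pathlib import Path


def drop_into_consume(document: Path, consume_dir: Path) -> Path:
    """Copy a document into the consume folder and return the new path."""
    if not consume_dir.is_dir():
        raise NotADirectoryError(f"{consume_dir} is not a directory")
    target = consume_dir / document.name
    # Refuse to overwrite a file that is still waiting to be picked up.
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(document, target)
    return target


# Example (placeholder paths, adjust to your setup):
# drop_into_consume(Path("receipt.pdf"), Path("/path/to/paperless/consume"))
```

If everything is wired up, the copied file should show up in Paperless after the next check of the folder.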

ProFTPD

I recently acquired a new printer/scanner that includes the ability to send scans to a network share via FTP, SharePoint or a Samba share. I don't run any of these because I have no other need for them, but I chose to set up an FTP server because it seemed like the simplest option, and after a quick look I decided to use ProFTPD.

I run Debian, so it was just sudo apt install proftpd to get it installed. Now, while there certainly seems to be a lot of possible configuration, I didn't modify anything. I grabbed an FTP client to make sure it worked and logged in with my regular Linux username and password, which worked just fine, and I had access to everything my regular user had access to. It's been a very long time since I used FTP; the last time was way back when I was barely in secondary school and didn't really know how anything worked, but hey, it was quite an easy setup as it turns out. Of course it's plain FTP. I didn't bother setting up FTPS: since the server can't be reached from outside my LAN, I'm not particularly worried about it being accessed or spied on.
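If you'd rather script that login test than grab a graphical FTP client, Python's standard ftplib can do it. This is only a rough sketch: check_login is a name invented here, and the host and credentials in the commented example are placeholders.

```python
from ftplib import FTP


def check_login(ftp: FTP, user: str, password: str) -> tuple[str, list[str]]:
    """Log in and return the login directory plus its raw LIST output."""
    ftp.login(user=user, passwd=password)
    listing: list[str] = []
    # LIST returns one line per entry; collect them instead of printing.
    ftp.retrlines("LIST", listing.append)
    return ftp.pwd(), listing


# Example (hypothetical host and credentials):
# with FTP("192.168.1.10", timeout=10) as ftp:
#     directory, entries = check_login(ftp, "me", "secret")
#     print(directory, len(entries))
```

With an untouched default configuration, the returned directory would be expected to be the user's home directory, which matches the "access to everything" behaviour described above.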

I did later decide to restrict the accessible directories so the FTP connection can only access the consume folder.

To do so, you can create a file like custom.conf in /etc/proftpd/conf.d/ and enter the following:

DefaultRoot /path/to/paperless/consume/

That's it! ProFTPD just needs a restart to pick up the change.
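One way to sanity-check the DefaultRoot restriction is to log in again and see whether the session can climb out of its root. The sketch below is a rough heuristic, not an official ProFTPD check: is_jailed is a hypothetical helper, and it assumes the ftp object passed in is already logged in.

```python
from ftplib import FTP, error_perm


def is_jailed(ftp: FTP) -> bool:
    """Return True if the session starts at "/" and cannot move above it."""
    # Without DefaultRoot the session normally starts in the home directory.
    if ftp.pwd() != "/":
        return False
    try:
        ftp.cwd("..")
    except error_perm:
        pass  # some servers refuse the command outright
    return ftp.pwd() == "/"


# Example (hypothetical host and credentials):
# with FTP("192.168.1.10", timeout=10) as ftp:
#     ftp.login(user="me", passwd="secret")
#     print(is_jailed(ftp))
```

A result of True only says the session looks confined; a user whose real home directory is "/" would also pass, so treat it as a quick smoke test.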

Printer

My old all-in-one printer broke about two months ago and it was the last shitty inkjet I was going to own. I bought a laser all-in-one printer, a Brother DCP-L3520CDW (printers are always well named 🙄). I didn't buy it from Brother; weirdly, everywhere else I looked had it cheaper than getting it directly. It was still pricey, but after a little over a month of owning it I can safely say it was well worth it. I have never had a printer that just actually worked like this one. On phones and (Linux) desktops it just shows up on the network and you can print. This shouldn't be noteworthy, but it unfortunately is. Of course on phones you still need the app if you want to scan and save documents directly to your phone, but that's not required for my use case at least.

Anyway, as previously mentioned, the scanner can send things over the network and it's pretty easy to set up. You first head over to the web interface and log in, then go to 'Scan' > 'Scan to FTP/Network/Sharepoint' to make sure the first profile is set to FTP. To configure the connection, go to 'Scan' > 'Scan to FTP/Network/Sharepoint Profile', click the first one, then fill in the details; they're all pretty self-explanatory. If you've restricted your FTP server to just the consume directory like I did, you can leave the 'Store Directory' field as '/'. I personally set most of the options to 'User Select' because I don't want to create multiple profiles, and it lets me set them on a per-scan basis.

Now to actually scan something! Just hit the scan button and use the arrows to select 'Scan to FTP'. You then get to configure any options you left as 'User Select' when setting up the profile, and finally you'll be told to press the start button. You can keep scanning pages until you select 'No (Send)', which will actually send the scan over FTP. As long as you don't see a "sending failed" message, the scan should have temporarily appeared in your consume folder and then promptly been imported :D
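If you want to confirm that a scan really was picked up, one option is to watch for the file disappearing from the consume folder. This is a small sketch assuming Paperless removes a file from the consume folder once it has imported it; wait_until_consumed and the example path are made up for illustration.

```python
import time
from pathlib import Path


def wait_until_consumed(scan: Path, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll until the file is gone from the consume folder, or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not scan.exists():
            return True
        time.sleep(interval)
    # One last check in case it vanished during the final sleep.
    return not scan.exists()


# Example (placeholder path):
# wait_until_consumed(Path("/path/to/paperless/consume/scan.pdf"))
```

A False result just means the file was still sitting there after the timeout, which is a hint to go and look at the Paperless logs.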

It all works perfectly and has definitely made scanning new documents a faster and nicer experience even if I do still have to change the auto assigned tags sometimes, though that's getting better the more I import. Overall, very happy with the setup ^_^