On 2020-05-21 16:57, davidson wrote:
> On Thu, 21 May 2020, David Christensen wrote:
>> On 2020-05-21 08:52, davidson wrote:
>>> On Thu, 14 May 2020, Albretch Mueller wrote:
>>>> The thing is that I have to call, say sha256sum, on millions of files
>>>>
>>>> Probably debian admin people dealing with packaging have to deal with the same kinds of issues.
>>>
>>> For checksums, mtree(8) from package mtree-netbsd might be worth a look.
>>
>> Been there, done that; I do not recommend it:
>>
>> https://lists.debian.org/debian-user/2020/01/msg00488.html
>
> The thread you refer to reports problems with the mtree-à-la-FreeBSD ("fmtree(8)" [1]) in debian package freebsd-buildutils.
>
> mtree-netbsd is a different debian package, providing mtree-à-la-NetBSD ("mtree(8)" [2]). It does not seem to suffer from the deficiency you encountered with fmtree.
>
> 1. https://manpages.debian.org/buster/freebsd-buildutils/fmtree.8.en.html
> 2. https://manpages.debian.org/buster/mtree-netbsd/mtree.8.en.html
Thanks for the tip. [2] is older than [1]. Both are older than the version on FreeBSD:
https://www.freebsd.org/cgi/man.cgi?mtree(8)
I cannot remember if I found them both when I went looking for mtree(8) on Debian, but I would have picked the newer of the two.
I was trying to validate migration of ~0.9 TB of content from a Debian Samba server to a FreeBSD Samba server. I know I failed. I seem to recall it was due to a missing feature in the Debian version of mtree(8).
Also, I do not believe the input/output format of mtree(8) is compatible with the I/O format of sha256sum(1). Using mtree(8) output as sha256sum(1) input, or vice-versa, requires a translation command or script.
I do seem to recall writing a Perl script to parse mtree(8) output. The mtree(8) convert option '-C' was the key. The Debian version I tried lacked it. The other version seems to have it. So, maybe...
I think the simplest answer on Debian is to use find(1), xargs(1) (with the -P option), and sha256sum(1) to generate an SHA256SUMS file.
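As a rough sketch of that approach, here is a Python stand-in for a find(1), xargs(1) -P, and sha256sum(1) pipeline. It is not the pipeline itself: the function names (sha256_of, regular_files, write_sums) and the default worker count are illustrative assumptions. It walks a tree, hashes regular files on a small thread pool, and writes lines in the "digest, two spaces, path" form that sha256sum -c reads.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()


def regular_files(root):
    """Yield regular files under root, roughly like `find root -type f`."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (sockets, FIFOs, devices).
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def write_sums(root, out, workers=4):
    """Hash files in parallel and write one 'digest  path' line per file."""
    paths = sorted(regular_files(root))  # whole path list is held in memory
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so output stays sorted by path.
        for path, digest in zip(paths, pool.map(sha256_of, paths)):
            out.write(f"{digest}  {path}\n")
```

For example, `with open("SHA256SUMS", "w") as out: write_sums(".", out)` would produce a file that `sha256sum -c SHA256SUMS` can check. Threads are used on the assumption that hashing large buffers runs outside the interpreter lock; file names containing newlines or backslashes are not escaped the way sha256sum(1) escapes them.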
However, before I learned of mtree(8), I wrote a Perl script to perform essentially the same function -- compare metadata and checksums of two directory trees, or the same tree at two different points in time. I soon discovered how wasteful it is to recompute checksums for 0.9 TB of files (hours) when only a tiny fraction have changed (seconds or minutes). So, I added an update feature to the Perl script. This made the script far more efficient, and therefore usable. AFAIK no version of mtree(8) has this feature. A find(1), xargs(1), and sha256sum(1) pipeline would also lack this feature, and an SHA256SUMS file lacks the metadata fields required to implement it.
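To illustrate the kind of update feature described above, here is a minimal Python sketch. It is not the Perl script: the manifest layout (relative path mapped to size, mtime_ns, and sha256), the JSON storage, and the function names are all assumptions made for illustration. A file is rehashed only when its size or modification time differs from the stored entry; otherwise the stored checksum is reused.

```python
import hashlib
import json
import os


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()


def update_manifest(root, old):
    """Return (new_manifest, rehashed) for the tree under root.

    old maps relative path -> {"size", "mtime_ns", "sha256"} (hypothetical
    layout). Entries for files that no longer exist are dropped.
    """
    new, rehashed = {}, []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            rel = os.path.relpath(path, root)
            prev = old.get(rel)
            if (prev and prev["size"] == st.st_size
                    and prev["mtime_ns"] == st.st_mtime_ns):
                digest = prev["sha256"]      # metadata unchanged: reuse
            else:
                digest = sha256_of(path)     # new or changed: recompute
                rehashed.append(rel)
            new[rel] = {"size": st.st_size,
                        "mtime_ns": st.st_mtime_ns,
                        "sha256": digest}
    return new, rehashed


def load_manifest(path):
    """Load a saved manifest, or return an empty one on the first run."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_manifest(path, manifest):
    """Write the manifest as JSON."""
    with open(path, "w") as f:
        json.dump(manifest, f, indent=1, sort_keys=True)
```

The trade-off is the usual one for metadata-based change detection: a file whose contents change while its size and modification time stay the same would be missed, so an occasional full recompute is still needed.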
David