
Bug#776658: marked as done (lintian.d.o: Use database to reduce memory footprint)



Your message dated Sun, 19 Apr 2020 17:38:17 -0700
with message-id <CAFHYt576-3Lh3eStL2hFUECMLB24kWYJqJqdDns_V5_ZcXK1=w@mail.gmail.com>
and subject line Results of Lintian's archive scan now in Sqlite3 database
has caused the Debian Bug report #776658,
regarding lintian.d.o: Use database to reduce memory footprint
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
776658: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776658
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: lintian
Version: 2.5.30+deb8u3
Severity: important

The reporting framework consumes a rather substantial amount of
memory.

The harness process itself hogs ~1GB of RAM.  This in itself is not
concerning.  However, it retains this usage even while running lintian
and html_reports.  For the former, it "just" needs the current "work
queue" in memory.  For the latter, it should not need any memory worth
mentioning.

The html_reports process itself consumes up to 2GB while processing
templates.  It is possible that there is nothing we can do about that
as there *is* a lot of data in play.  But even then, we can free it as
soon as possible (so we do not keep it while running gnuplot at the
end of the run).

Currently, when harness -i runs, the gnuplot process seems to die for
"no apparent" reason.  I suspect it is OOM'ed, though harness +
html_reports "only" consumes 65-70%ish of the memory available and
gnuplot seems fairly cheap memory-wise in comparison.
  When running harness -r alone, harness skips the parts of the code
that make it consume memory, and that seems to be sufficient to let
html_reports + gnuplot terminate successfully.

~Niels

--- End Message ---
--- Begin Message ---
Hi,

On 2017-07-16, Niels Thykier wrote:
>
> I think a database might be the only realistic way forward

The reporting framework now uses an Sqlite3 database. The framework is
separate from Lintian. There is a tag sieve that scans the archive
[1], and a public website to inspect the results. [2] Both are
connected via an Sqlite3 database.

[1] https://salsa.debian.org/lechner/taxiv
[2] https://salsa.debian.org/lechner/detagtive

I will move the two repos to our team area when they are ready.

The database import is helped greatly by a new, experimental JSON
output mode in Lintian.
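
For illustration only, an import of this kind might look roughly like
the sketch below. It is not the actual importer: the JSON layout
(a list of records with "package", "version", "tag" and "severity"
keys), the table name "tags" and the helper names are all assumptions
made for this example, not Lintian's real output schema.

```python
import json
import sqlite3


def import_tags(conn, json_text):
    """Insert hypothetical Lintian-style JSON records into SQLite.

    The record keys used here are assumed for illustration; the real
    JSON output of Lintian may be structured differently.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        " package TEXT NOT NULL,"
        " version TEXT NOT NULL,"
        " tag TEXT NOT NULL,"
        " severity TEXT NOT NULL)"
    )
    records = json.loads(json_text)
    rows = [
        (r["package"], r["version"], r["tag"], r["severity"])
        for r in records
    ]
    # One transaction for the whole batch keeps the import fast.
    with conn:
        conn.executemany("INSERT INTO tags VALUES (?, ?, ?, ?)", rows)
    # An index speeds up per-tag lookups at the cost of file size.
    conn.execute("CREATE INDEX IF NOT EXISTS tags_by_name ON tags (tag)")
    return len(rows)


def count_by_tag(conn):
    """Return a mapping of tag name to number of occurrences."""
    cursor = conn.execute(
        "SELECT tag, COUNT(*) FROM tags GROUP BY tag ORDER BY tag"
    )
    return dict(cursor)


if __name__ == "__main__":
    # Made-up sample data in the assumed layout.
    sample = json.dumps([
        {"package": "foo", "version": "1.0-1",
         "tag": "example-tag-a", "severity": "warning"},
        {"package": "bar", "version": "2.3-4",
         "tag": "example-tag-a", "severity": "warning"},
        {"package": "bar", "version": "2.3-4",
         "tag": "example-tag-b", "severity": "error"},
    ])
    conn = sqlite3.connect(":memory:")
    print(import_tags(conn, sample))
    print(count_by_tag(conn))
```

With the data on disk, a website process can query only the rows it
needs instead of holding all results in memory at once.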

The Sqlite3 database appears sufficient for the time being. Its size
is approximately 230 MB with indices, and 150 MB without. Compressed,
the database is about 15 MB, and therefore just a bit larger than the
traditional lintian.log.gz. If historical information is worth
keeping, we may ask for a Postgres instance.

Also, we have a new lintian.d.o in beta. Please let us know what you think.

It is not clear that the database will reduce the memory footprint,
but I am closing this bug.

Kind regards
Felix Lechner

--- End Message ---
