DVDs working

Audio CDs are working well!
Refactor. CD maybe working?
2022-08-25 10:17:25 -06:00
11 changed files with 406 additions and 288 deletions
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -10,6 +10,7 @@
        "gnudb",
        "newfn",
        "RDONLY",
+        "cdparanoia",
        "TTITLE"
    ]
 }
--- a/13
+++ b/13
@@ -7,24 +7,17 @@ RUN true \
  && sed -i 's/main$/main contrib non-free/' /etc/apt/sources.list \
  && apt-get -y update \
  && DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends -y install \
-    ffmpeg \
-    handbrake-cli libavcodec-extra \
-    abcde eyed3 \
-    glyrc setcd eject \
    dvdbackup \
    libdvd-pkg libdvdcss2 \
+    handbrake-cli libavcodec-extra \
+    cd-discid cdparanoia lame \
    python3 \
-    cowsay \
+    python3-slugify \
  && true
 RUN dpkg-reconfigure libdvd-pkg

 RUN true \
  && DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends -y install \
-    lame \
-    busybox \
-    jq \
-    procps \
-    moreutils \
    cowsay

 COPY src/* /app/
--- a/README.md
+++ b/README.md
@@ -11,9 +11,23 @@ and then re-encode the content to a compressed format.

 At the time I'm writing this README, it will:

-* ~~Rip audio CDs, look them up in cddb, encode them to VBR MP3, then tag them.~~ A rewrite broke this; I plan to fix it soon.
+* Rip audio CDs, look them up in cddb, encode them to VBR MP3, then tag them. 
+  * It also writes a shell script you can modify to quickly change the tags, since this is a pretty common thing to want to do.
 * Rip video DVDs, transcode them to mkv

+## Requirements
+
+The requirements are fairly light: a few CD tools, cdparanoia, HandBrakeCLI, and some
+DVD libraries.
+
+Most notably, you do *not* need a relational database (SQLite, Postgres, MySQL).
+You just need a file system.
+
+For a complete list of requirements,
+look at the [Dockerfile](Dockerfile) 
+to see what Debian packages it installs.
+
+
 ## How To Run This

 You need a place to store your stuff.
@@ -27,9 +41,8 @@ Mine is `/srv/ext/incoming`.
        -v /srv/ext/incoming:/incoming \
        registry.gitlab.com/dartcatcher/media-sucker/media-sucker

-I can't get it to work with docker swarm.
-Presumably some magic is happening with `--device`.
-It probably has something to do with selinux.
+I can't get it to work with docker swarm,
+which doesn't support `--device`.

 Stick a video DVD or audio CD in,
 and the drive should spin up for a while,
@@ -39,9 +52,14 @@ or a new directory of `.mp3` files (for audio).

 You can watch what it's doing at http://localhost:8080/

+
 ## A note on filenames and tags

 This program does the absolute minimum to try and tag your media properly.
+That's partly because I'm a lazy programmer,
+but mostly because the computer can only guess at things that you,
+the operator,
+can just read off the box.

 For DVDs, that means reading the "title" stored on the DVD,
 which I've seen vary from very helpful (eg. "Barbie A Fashion Fairytale")
@@ -55,13 +73,10 @@ so CDDB takes the length of every track in seconds and tries to match that
 against something a user has uploaded in the past.
 This is wrong a whole lot of the time.

-If CDDB can't find a match for an audio CD,
-this program will append the datestamp of the rip to the album name,
-in the hopes that you can remember about what time you put each CD in the drive.
-So for stuff like multi-CD audiobooks, that's pretty helpful.
-
 But the end result in almost every case is that you're going to have to
-manually edit the metadata.
+rename the movie file, or re-tag the audio files.
+This is why you get a `tag.sh` file with every audio CD rip.
+

 ## Answers

@@ -69,35 +84,23 @@ I'm skipping the part where I make up questions I think people might have.

 ### Why I Wrote This

-The `automatic-ripping-machine` looks really badass.
+The automatic-ripping-machine looks really badass.
 But after multiple attempts across multiple months
 to get it running,
 I decided it would probably be faster just to write my own.

-This isn't as cool as the aumomatic-ripping-machine.
+media-sucker isn't as cool as the automatic-ripping-machine.
 But, at least for me,
-it's a lot more functional,
-in that it actually does something.
+it's more useful,
+in that I can get it to actually do something.

 ### Why You Should Run This

 The only reason I can think of that anybody would want to use this is if they,
 like me,
-are too dumb to get the `automatic-ripping-machine` to work.
+are too dumb to get the automatic-ripping-machine to work.

 ### What Kind Of Hardware I Use

 I run it on a Raspberry Pi 4,
 with a Samsung DVD drive from the stone age.
-
-
-## Parting note
-
-As of 2022-08-22, large sections of this code were written under COVID brain-fog.
-This means it's going to look a lot like a 13-year-old wrote it.
-
-I hope one day to clean it up a bit,
-but it's working fairly well, 
-despite the mess.
-Please don't judge me for the organization of things.
-Judge bizarro universe Neale instead.
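
Review note on the `tag.sh` file the README now advertises: the generator in `src/cd.py` (further down in this diff) writes lines such as `ALBUM=%s` without quoting, so a title containing spaces or quotes would likely produce a script the shell misreads. The following is a minimal sketch of one way to quote the values, assuming the standard library's `shlex.quote` is acceptable here. The function name `tag_script` and its `files` parameter are hypothetical and not part of the commit; the `id3v2` options are the ones the commit already uses.

```python
import shlex


def tag_script(state, files):
    """Build the text of a tag.sh script with shell-safe quoting.

    `state` is a dict like the one scan() fills in; `files` is a list of
    (output filename, song title) pairs. Both names are illustrative only.
    """
    lines = ["#! /bin/sh", ""]
    for var, key in (
        ("ALBUM", "title"),
        ("ARTIST", "artist"),
        ("GENRE", "genre"),
        ("YEAR", "year"),
    ):
        # shlex.quote wraps the value in single quotes when it needs them.
        lines.append("%s=%s" % (var, shlex.quote(str(state.get(key, "")))))
    lines.append("")

    total = len(files)
    for num, (filename, song) in enumerate(files, start=1):
        lines.append(
            'id3v2 --album "$ALBUM" --artist "$ARTIST"'
            ' --genre "$GENRE" --year "$YEAR"'
            " --song %s --track %d/%d %s"
            % (shlex.quote(song), num, total, shlex.quote(filename))
        )
    return "\n".join(lines) + "\n"


script = tag_script(
    {"title": "Abbey Road", "artist": "The Beatles"},
    [("01 - Come Together.mp3", "Come Together")],
)
```

With this approach the variable assignments stay easy to edit by hand, which is the point of shipping the script, while unusual titles no longer break it.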
--- a/doc/architecture.md
+++ b/doc/architecture.md
@@ -0,0 +1,54 @@
+# Web Server
+
+There is one web server,
+which provides static content,
+and a single entrypoint for dynamic state information.
+
+The static content is some HTML and JavaScript,
+which the browser runs to pull the dynamic state,
+and update the page with the current status of everything.
+
+
+# Workers
+
+There are at least two Workers:
+a Reader and an Encoder.
+Each Worker runs in its own thread,
+and can do its job without interfering with another Worker.
+
+## Readers
+
+Readers monitor a device for media.
+Right now, those devices are always CD-ROM drives.
+As soon as media is inserted,
+a MediaHandler is created to scan and then copy it.
+
+## Encoders
+
+Encoders wait for jobs to show up,
+and then they re-invoke a MediaHandler to encode everything in that job.
+
+
+# MediaHandlers
+
+MediaHandlers have a work directory,
+where they store all their stuff.
+They have the following stages of execution:
+
+1. *scan* the media to figure out its title, list of tracks, and other metadata
+2. *copy* the media to the work directory
+3. *encode* the work directory into the desired format (eg. MP3, MKV)
+4. *clean* the work directory
+
+Before each step,
+state is read out of the work directory.
+
+During each step,
+a MediaHandler continually updates its Worker with a completion percentage.
+This is passed up to the Web Server's dynamic state.
+
+After each step, 
+a MediaHandler updates its state,
+which is stored on disk.
+The only way to communicate state between execution stages is by writing to disk.
+This provides some tolerance of job interruption, power loss, etc.
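
Review note: the stage lifecycle described in `doc/architecture.md` (read state before a step, report progress during it, persist state after it) can be sketched roughly as below. This is an illustration under stated assumptions, not the code in this commit: `run_stage`, the `state.json` filename, and the toy `scan`/`copy` stages are all hypothetical stand-ins.

```python
import json
import os
import tempfile


def write_state(workdir, state):
    """Persist state by writing a temp file, then renaming it into place."""
    path = os.path.join(workdir, "state.json")
    tmp = path + ".new"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def read_state(workdir):
    with open(os.path.join(workdir, "state.json")) as f:
        return json.load(f)


def scan(state):
    # Hypothetical stage: a real scan would query the drive.
    state["title"] = "Example Disc"
    state["tracks"] = ["Track 01", "Track 02"]
    yield 1.0


def copy(state):
    # Yield a completion fraction after each track is "copied".
    total = len(state["tracks"])
    for i in range(total):
        yield (i + 1) / total


def run_stage(stage, workdir, report):
    state = read_state(workdir)    # before the step: read state from disk
    for fraction in stage(state):  # during the step: report progress
        report(fraction)
    write_state(workdir, state)    # after the step: write state to disk


def demo():
    progress = []
    with tempfile.TemporaryDirectory() as workdir:
        write_state(workdir, {})
        for stage in (scan, copy):
            run_stage(stage, workdir, progress.append)
        return read_state(workdir), progress
```

Because each stage only sees what was written to disk, an interrupted job can in principle resume from the last completed stage.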
--- a/src/cd.py
+++ b/src/cd.py
@@ -13,7 +13,7 @@ SECOND = 1
 MINUTE = 60 * SECOND
 HOUR = 60 * MINUTE

-def read(device, status):
+def scan(state, device):
    # Get disc ID
    p = subprocess.run(
        [
@@ -23,8 +23,9 @@ def read(device, status):
        encoding="utf-8",
        capture_output=True,
    )
-    discid = p.stdout
-    status["discid"] = discid
+    discid = p.stdout.strip()
+    state["discid"] = discid
+    cddb_id = discid.split()[0]

    # Look it up in cddb
    email = os.environ.get("EMAIL") # You should really set this variable, tho
@@ -42,22 +43,23 @@ def read(device, status):
        # We're expected to be automatic here,
        # so just use the first one.
        for k in ("title", "artist", "genre", "year", "tracks"):
-            status[k] = disc[k]
+            state[k] = disc[k]
    else:
-        now = time.strftime("%Y-%m-%dT%H%M%S")
        num_tracks = int(discid.split()[1])
-        status["title"] = "Unknown CD - %s" % now
-        status["tracks"] = [""] * num_tracks
+        state["title"] = "Unknown CD - %s" % cddb_id
+        state["tracks"] = ["Track %02d" % (i+1) for i in range(num_tracks)]

-def rip(device, status, directory):
+
+def copy(state, device, directory):
    # cdparanoia reports completion in samples
    # use discid duration to figure out total number of samples
-    duration = int(status["discid"].split()[-1]) * SECOND # disc duration in seconds
+    duration = int(state["discid"].split()[-1]) * SECOND # disc duration in seconds
    total_samples = duration * (75 / SECOND) * 1176 # 75 sectors per second, 1176 samples per sector
+    state["total_samples"] = total_samples

    track_num = 1
-    for track_name in status["tracks"]:
-        logging.debug("Ripping track %d of %d", track_num, len(status["tracks"]))
+    for track_name in state["tracks"]:
+        logging.debug("Ripping track %d of %d", track_num, len(state["tracks"]))
        p = subprocess.Popen(
            [
                "cdparanoia",
@@ -75,56 +77,110 @@ def rip(device, status, directory):
            line = line.strip()
            if line.startswith("##: -2"):
                samples = int(line.split()[-1])
-                status["complete"] = samples / total_samples
+                yield samples / total_samples

        track_num += 1

-def encode(status, directory):
-    # Encode the tracks
+
+def encode(state, directory):
    track_num = 1
-    for track_name in status["tracks"]:
+    total_tracks = len(state["tracks"])
+    durations = [int(d) for d in state["discid"].split()[2:-1]]
+    total_duration = sum(durations)
+    encoded_duration = 0
+    
+    tag_script = io.StringIO()
+    tag_script.write("#! /bin/sh\n")
+    tag_script.write("\n")
+    tag_script.write("ALBUM=%s\n" % state["title"])
+    tag_script.write("ARTIST=%s\n" % state.get("artist", ""))
+    tag_script.write("GENRE=%s\n" % state.get("genre", ""))
+    tag_script.write("YEAR=%s\n" % state.get("year", ""))
+    tag_script.write("\n")
+    
+    for track_name in state["tracks"]:
+        logging.debug("Encoding track %d (%s)" % (track_num, track_name))
+        duration = durations[track_num-1]
        argv = [
            "lame",
+            "--brief",
+            "--nohist",
+            "--disptime", "1",
            "--preset", "standard",
-            "-tl", status["title"],
-            "--tn", "%d/%d" % (track_num, len(status["tracks"])),
+            "--tl", state["title"],
+            "--tn", "%d/%d" % (track_num, total_tracks),
        ]
-        if status["artist"]:
-            argv.extend(["-ta", status["artist"]])
-        if status["genre"]:
-            argv.extend(["-tg", status["genre"]])
-        if status["year"]:
-            argv.extend(["-ty", status["year"]])
+        tag_script.write("id3v2")
+        tag_script.write(" --album \"$ALBUM\"")
+        tag_script.write(" --artist \"$ARTIST\"")
+        tag_script.write(" --genre \"$GENRE\"")
+        tag_script.write(" --year \"$YEAR\"")
+        if state.get("artist"):
+            argv.extend(["--ta", state["artist"]])
+        if state.get("genre"):
+            argv.extend(["--tg", state["genre"]])
+        if state.get("year"):
+            argv.extend(["--ty", state["year"]])
        if track_name:
-            argv.extend(["-tt", track_name])
-            outfn = "%d - %s.mp3" % (track_num, track_name)
+            argv.extend(["--tt", track_name])
+            tag_script.write(" --song \"%s\"" % track_name)
+            outfn = "%02d - %s.mp3" % (track_num, track_name)
        else:
-            outfn = "%d.mp3" % track_num
+            outfn = "%02d.mp3" % track_num
        argv.append("track%02d.cdda.wav" % track_num)
        argv.append(outfn)
+        tag_script.write("\\\n    ")
+        tag_script.write(" --track %d/%d" % (track_num, total_tracks))
+        tag_script.write(" \"%s\"\n" % outfn)
+
        p = subprocess.Popen(
            argv,
            cwd = directory,
-            stdin = subprocess.PIPE,
+            stderr = subprocess.PIPE,
            encoding = "utf-8",
        )
-        p.communicate(input=track_name)
+        for line in p.stderr:
+            line = line.strip()
+            if "%)" in line:
+                pct_text = line.split("(")[1]
+                pct_text = pct_text.split("%")[0]
+                pct = int(pct_text) / 100
+                yield (encoded_duration + (duration * pct)) / total_duration
+
+        encoded_duration += duration
        track_num += 1

+    with open(os.path.join(directory, "tag.sh"), "w") as f:
+        f.write(tag_script.getvalue())        
+
+
+def clean(state, directory):
+    for fn in os.listdir(directory):
+        if fn.endswith(".wav"):
+            os.remove(os.path.join(directory, fn))
+
 if __name__ == "__main__":
    import pprint
+    import sys
+    import json

    logging.basicConfig(level=logging.DEBUG)
-    status = {}
-    read("/dev/sr0", status)
-    pprint.pprint(status)

-    directory = os.path.join(".", status["title"])
+    state = {}
+    scan(state, "/dev/sr0")
+    pprint.pprint(state)
+
+    directory = os.path.join(".", state["title"])
    os.makedirs(directory, exist_ok=True)
-    rip("/dev/sr0", status, directory)
-    pprint.pprint(status)
+    with open(os.path.join(directory, "state.json"), "w") as f:
+        json.dump(state, f)

-    encode(status, directory)
-    pprint.pprint(status)
+    for pct in copy(state, "/dev/sr0", directory):
+        sys.stdout.write("Copying: %3d%%\r" % (pct*100))
+    pprint.pprint(state)
+
+    for pct in encode(state, directory):
+        sys.stdout.write("Encoding: %3d%%\r" % (pct*100))
+    pprint.pprint(state)

 # vi: sw=4 ts=4 et ai
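
Review note on `encode()` in `src/cd.py`: it treats `discid.split()[2:-1]` as per-track durations. Assuming `cd-discid` prints the disc ID, the track count, one start offset per track in frames (75 per second), and finally the total length in seconds, those fields are offsets rather than durations, so the progress weighting would be skewed. A minimal sketch of deriving durations from the offsets follows; the function name `track_durations` and the sample line (whose disc ID is made up) are hypothetical.

```python
FRAMES_PER_SECOND = 75  # CD audio is addressed in frames of 1/75 second


def track_durations(discid_line):
    """Return per-track durations in seconds from a cd-discid style line.

    Assumed layout: <id> <track count> <offset per track...> <total seconds>
    """
    fields = discid_line.split()
    num_tracks = int(fields[1])
    offsets = [int(f) for f in fields[2:2 + num_tracks]]
    total_seconds = int(fields[-1])

    # The end of the last track is approximated by the total disc length.
    boundaries = offsets + [total_seconds * FRAMES_PER_SECOND]
    return [
        (end - start) / FRAMES_PER_SECOND
        for start, end in zip(boundaries, boundaries[1:])
    ]


# Made-up example line: three tracks on a 600-second disc.
durations = track_durations("a50b250c 3 150 15000 30000 600")
# durations == [198.0, 200.0, 200.0]
```

The same list could then feed the `(encoded_duration + duration * pct) / total_duration` calculation unchanged.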
--- a/src/dvd.py
+++ b/src/dvd.py
@@ -10,165 +10,151 @@ SECOND = 1
 MINUTE = 60 * SECOND
 HOUR = 60 * MINUTE

-class Copier:
-    def __init__(self, device, status):
-        self.device = device
-        self.status = status
-        self.scan()
-
-    def collect(self, track):
-        newCollection = []
-        for t in self.collection:
-            if t["length"] == track["length"]:
-                # If the length is exactly the same,
-                # assume it's the same track,
-                # and pick the one with the most stuff.
-                if len(track["audio"]) < len(t["audio"]):
-                    return
-                elif len(track["subp"]) < len(t["subp"]):
-                    return
-            newCollection.append(t)
-        newCollection.append(track)
-        self.collection = newCollection           
-
-    def scan(self):
-        self.status["state"] = "scanning"
-
-        self.collection = []
-        p = subprocess.run(
-            [
-                "lsdvd",
-                "-Oy",
-                "-x",
-                self.device,
-            ], 
-            encoding="utf-8",
-            capture_output=True,
-        )
-        lsdvd = eval(p.stdout[8:]) # s/^lsdvd = //
-        title = lsdvd["title"]
-        if title in ('No', 'unknown'):
-            title = lsdvd["provider_id"]
-            if title == "$PACKAGE_STRING":
-                title = "DVD"
-        now = time.strftime("%Y-%m-%dT%H%M%S")
-        title = "%s %s" % (title, now)
-
-        # Go through all the tracks, looking for the largest referenced sector.
-        max_sector = 0
-        max_length = 0
-        tracks = lsdvd["track"]
-        for track in tracks:
-            max_length = max(track["length"], max_length)
-            for cell in track["cell"]:
-                max_sector = max(cell["last_sector"], max_sector)
-        if max_sector == 0:
-            logging.info("Media size = 0; aborting")
-            return
-
-        # Make a guess about what's on this DVD.
-        # We will categories into three types:
-        # * A feature, which has one track much longer than any other
-        # * A collection of shows, which has several long tracks, more or less the same lengths
-        # * Something else
-        for track in tracks:
-            if track["length"] / max_length > 0.80:
-                self.collect(track)
-        if (max_length < 20 * MINUTE) and (len(self.collection) < len(track) * 0.6):
-            self.collection = tracks
-
-        self.status["title"] = title
-        self.status["size"] = max_sector * 2048 # DVD sector size = 2048
-        self.status["tracks"] = [(t["ix"], t["length"]) for t in self.collection]
+def collect(collection, track):
+    newCollection = []
+    for t in collection:
+        if t["length"] == track["length"]:
+            # If the length is exactly the same,
+            # assume it's the same track,
+            # and pick the one with the most stuff.
+            if len(track["audio"]) < len(t["audio"]):
+                return collection
+            elif len(track["subp"]) < len(t["subp"]):
+                return collection
+        newCollection.append(t)
+    newCollection.append(track)
+    return newCollection


-    def copy(self, directory):
-        self.status["state"] = "copying"
+def scan(state, device):
+    p = subprocess.run(
+        [
+            "lsdvd",
+            "-Oy",
+            "-x",
+            device,
+        ], 
+        encoding="utf-8",
+        capture_output=True,
+    )
+    lsdvd = eval(p.stdout[8:]) # s/^lsdvd = //
+    title = lsdvd["title"]
+    if title in ('No', 'unknown'):
+        title = lsdvd["provider_id"]
+    if title == "$PACKAGE_STRING":
+        now = time.strftime(r"%Y-%m-%dT%H:%M:%S")
+        title = "DVD %s" % now

+    # Go through all the tracks, looking for the largest referenced sector.
+    max_sector = 0
+    max_length = 0
+    tracks = lsdvd["track"]
+    for track in tracks:
+        max_length = max(track["length"], max_length)
+        for cell in track["cell"]:
+            max_sector = max(cell["last_sector"], max_sector)
+    if max_sector == 0:
+        logging.info("Media size = 0; aborting")
+        return
+
+    # Make a guess about what's on this DVD.
+    # We will categorize it into one of three types:
+    # * A feature, which has one track much longer than any other
+    # * A collection of shows, which has several long tracks, more or less the same lengths
+    # * Something else
+    collection = []
+    for track in tracks:
+        if track["length"] / max_length > 0.80:
+            collection = collect(collection, track)
+    if (max_length < 20 * MINUTE) and (len(collection) < len(tracks) * 0.6):
+        collection = tracks
+
+    state["title"] = title
+    state["size"] = max_sector * 2048 # DVD sector size = 2048
+    state["tracks"] = [(t["ix"], t["length"]) for t in collection]
+
+def copy(state, device, directory):
+    p = subprocess.Popen(
+        [
+            "dvdbackup",
+            "--input=" + device,
+            "--name=" + state["title"],
+            "--mirror",
+            "--progress",
+        ],
+        encoding="utf-8",
+        stdout=subprocess.PIPE,
+        stderr=subprocess.STDOUT,
+        cwd=directory,
+    )
+    totalBytes = titleSize = lastTitleSize = 0
+    progressRe = re.compile(r"^Copying.*?([0-9.]+)/[0-9.]+ (MiB|KiB)")
+    for line in p.stdout:
+        line = line.strip()
+        m = progressRe.search(line)
+        if m and m[2] == "MiB":
+            titleSize = float(m[1]) * 1024 * 1024
+        elif m and m[2] == "KiB":
+            titleSize = float(m[1]) * 1024
+        if titleSize < lastTitleSize:
+            totalBytes += lastTitleSize
+        lastTitleSize = titleSize
+        yield (totalBytes + titleSize) / state["size"]
+
+
+def encode(state, directory):
+    title = state["title"]
+    logging.info("encoding: %s (%s)" % (title, directory))
+
+    total_length = sum(t[1] for t in state["tracks"])
+    finished_length = 0
+    for track, length in state["tracks"]:
+        outfn = "%s-%d.mkv" % (title, track)
+        tmppath = os.path.join(directory, outfn)
+        outpath = os.path.join(directory, "..", outfn)
        p = subprocess.Popen(
            [
-                "dvdbackup",
-                "--input=" + self.device,
-                "--name=" + self.status["title"],
-                "--mirror",
-                "--progress",
+                "nice",
+                "HandBrakeCLI",
+                "--json",
+                "--input", "%s/%s/VIDEO_TS" % (directory, state["title"]),
+                "--output", tmppath,
+                "--title", str(track),
+                "--native-language", "eng",
+                "--markers",
+                "--loose-anamorphic",
+                "--all-subtitles",
+                "--all-audio",
+                "--aencoder", "copy",
+                "--audio-copy-mask", "aac,ac3,mp3",
+                "--audio-fallback", "aac",
            ],
            encoding="utf-8",
            stdout=subprocess.PIPE,
-            stderr=subprocess.STDOUT,
-            cwd=directory,
+            stderr=None,
        )
-        totalBytes = titleSize = lastTitleSize = 0
-        progressRe = re.compile(r"^Copying.*([0-9.]+)/[0-9.]+ (MiB|KiB)")
+
+        # HandBrakeCLI spits out sort of JSON.
+        # But Python has no built-in way to stream JSON objects.
+        # Hence this kludge.
+        progressRe = re.compile(r'^"Progress": ([0-9.]+),')
        for line in p.stdout:
            line = line.strip()
            m = progressRe.search(line)
-            if m and m[2] == "MiB":
-                titleSize = float(m[1]) * 1024 * 1024
-            elif m and m[2] == "KiB":
-                titleSize = float(m[1]) * 1024
-            if titleSize < lastTitleSize:
-                totalBytes += lastTitleSize
-            lastTitleSize = titleSize
-            self.status["complete"] = (totalBytes + titleSize) / self.status["size"]
+            if m:
+                progress = float(m[1])
+                yield (finished_length + progress*length) / total_length
+
+        finished_length += length
+        os.rename(
+            src=tmppath,
+            dst=outpath,
+        )
+        logging.info("Finished track %d; length %d" % (track, length))


-class Encoder:
-    def __init__(self, basedir, status):
-        self.basedir = basedir
-        self.status = status
-
-    def encode(self, obj):
-        title = obj["title"]
-        logging.info("encoding: %s (%s)" % (title, self.basedir))
-
-        total_length = sum(t[1] for t in obj["tracks"])
-        finished_length = 0
-        for track, length in obj["tracks"]:
-            outfn = "%s-%d.mkv" % (title, track)
-            tmppath = os.path.join(self.basedir, outfn)
-            outpath = os.path.join(self.basedir, "..", outfn)
-            p = subprocess.Popen(
-                [
-                    "nice",
-                    "HandBrakeCLI",
-                    "--json",
-                    "--input", "%s/VIDEO_TS" % self.basedir,
-                    "--output", tmppath,
-                    "--title", str(track),
-                    "--native-language", "eng",
-                    "--markers",
-                    "--loose-anamorphic",
-                    "--all-subtitles",
-                    "--all-audio",
-                    "--aencoder", "copy",
-                    "--audio-copy-mask", "aac,ac3,mp3",
-                    "--audio-fallback", "aac",
-                ],
-                encoding="utf-8",
-                stdout=subprocess.PIPE,
-                stderr=None,
-            )
-
-            # HandBrakeCLI spits out sort of JSON.
-            # But Python has no built-in way to stream JSON objects.
-            # Hence this kludge.
-            progressRe = re.compile(r'^"Progress": ([0-9.]+),')
-            for line in p.stdout:
-                line = line.strip()
-                m = progressRe.search(line)
-                if m:
-                    progress = float(m[1])
-                    complete = (finished_length + progress*length) / total_length
-                    self.status["complete"] = complete
-
-            finished_length += length
-            os.rename(
-                src=tmppath,
-                dst=outpath,
-            )
-            logging.info("Finished track %d; length %d" % (track, length))
-
+def clean(state, directory):
+    os.removedirs(directory)

 if __name__ == "__main__":
    import pprint
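
Review note on the dvdbackup progress pattern `^Copying.*([0-9.]+)/[0-9.]+ (MiB|KiB)` that appears in this file's diff: a greedy `.*` in front of a capture group tends to swallow all but the last character the group could match. The sketch below compares greedy and lazy variants on a sample line. The sample line is an assumption about what `dvdbackup --progress` prints, not captured output.

```python
import re

# Pattern with a greedy ".*" before the captured number.
greedy = re.compile(r"^Copying.*([0-9.]+)/[0-9.]+ (MiB|KiB)")
# Same pattern with a lazy ".*?" so the whole number is captured.
lazy = re.compile(r"^Copying.*?([0-9.]+)/[0-9.]+ (MiB|KiB)")

# Hypothetical progress line; the real dvdbackup format may differ.
line = "Copying Title, part 1/4: 12% done (123/1024 MiB)"

greedy_value = greedy.search(line)[1]  # "3": only the last digit survives
lazy_value = lazy.search(line)[1]      # "123": the full copied amount
unit = lazy.search(line)[2]            # "MiB"
```

With the greedy form, the copied size would appear to bounce between 0 and 9, which would make the `titleSize < lastTitleSize` rollover check fire spuriously.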
--- a/src/encoder.py
+++ b/src/encoder.py
@@ -1,7 +1,6 @@
 #! /usr/bin/python3

 import os
-import threading
 import subprocess
 import glob
 import os
@@ -13,40 +12,47 @@ import re
 import logging
 import dvd
 import cd
+import traceback
+import worker

-class Encoder(threading.Thread):
-    def __init__(self, directory=None, **kwargs):
+class Encoder(worker.Worker):
+    def __init__(self, directory=None):
        self.status = {}
        self.directory = directory
-        return super().__init__(**kwargs)
+        return super().__init__(directory)

    def run(self):
        while True:
            wait = True
            self.status = {"type": "encoder", "state": "idle"}
-            for fn in glob.glob(os.path.join(self.directory, "*", "sucker.json")):
-                fdir = os.path.dirname(fn)
-                with open(fn) as f:
-                    obj = json.load(f)
-                self.encode(fdir, obj)
+            for fn in glob.glob(self.workdir("*", "sucker.json")):
+                directory = os.path.dirname(fn)
+                state = self.read_state(directory)
+                try:
+                    self.encode(directory, state)
+                except Exception as e:
+                    logging.error("Error encoding %s: %s" % (directory, e))
+                    logging.error(traceback.format_exc())
                wait = False
            if wait:
                time.sleep(12)

-    def encode(self, fdir, obj):
+    def encode(self, directory, state):
        self.status["state"] = "encoding"
-        self.status["title"] = obj["title"]
-        if obj["type"] == "audio":
-            self.encode_audio(fdir, obj)
-        else:
-            self.encode_video(fdir, obj)
-        shutil.rmtree(fdir)
-   
-    def encode_audio(self, fdir, obj):
-        cd.encode(obj, fdir)
+        self.status["title"] = state["title"]

-    def encode_video(self, fdir, obj):
-        enc = dvd.Encoder(fdir, self.status)
-        enc.encode(obj)
+        if state["video"]:
+            media = dvd
+        else:
+            media = cd
+
+        logging.info("Encoding %s (%s)" % (directory, state["title"]))
+        for pct in media.encode(state, directory):
+            self.status["complete"] = pct
+
+        media.clean(state, directory)
+        self.clear_state(directory)
+
+        logging.info("Finished encoding")

 # vi: sw=4 ts=4 et ai
--- a/src/reader.py
+++ b/src/reader.py
@@ -1,7 +1,6 @@
 #! /usr/bin/python3

 import os
-import threading
 import subprocess
 import time
 import re
@@ -9,8 +8,10 @@ import fcntl
 import traceback
 import json 
 import logging
+import slugify
 import dvd
 import cd
+import worker

 CDROM_DRIVE_STATUS = 0x5326
 CDS_NO_INFO = 0
@@ -28,25 +29,21 @@ CDS_DATA_2 = 102
 CDROM_LOCKDOOR = 0x5329
 CDROM_EJECT = 0x5309

-class Reader(threading.Thread):
-    def __init__(self, device, directory=None, **kwargs):
+class Reader(worker.Worker):
+    def __init__(self, device, directory):
+        super().__init__(directory)
        self.device = device
-        self.directory = directory
-        self.status = {
-            "type": "reader",
-            "state": "idle",
-            "device": self.device,
-        }
+        self.status["type"] = "reader"
+        self.status["device"] = device
        self.complete = 0
        self.staleness = 0
        self.drive = None
        logging.info("Starting reader on %s" % self.device)
-        return super().__init__(**kwargs)

    def reopen(self):
        if (self.staleness > 15) or not self.drive:
            if self.drive:
-                self.drive.close()
+                os.close(self.drive)
                self.drive = None
            try:
                self.drive = os.open(self.device, os.O_RDONLY | os.O_NONBLOCK)
@@ -69,24 +66,22 @@ class Reader(threading.Thread):
                rv = fcntl.ioctl(self.drive, CDROM_DISC_STATUS)
                try:
                    if rv == CDS_AUDIO:
-                        self.handle_audio()
+                        self.handle(False)
                    elif rv in [CDS_DATA_1, CDS_DATA_2]:
-                        self.handle_data()
+                        self.handle(True)
                    else:
                        logging.info("Can't handle disc type %d" % rv)
                except Exception as e:
                    logging.error("Error in disc handler: %s" % e)
                    logging.error(traceback.format_exc())
                self.eject()
-            elif rv in (CDS_TRAY_OPEN, CDS_NO_DISC):
+            elif rv in (CDS_TRAY_OPEN, CDS_NO_DISC, CDS_DRIVE_NOT_READ):
                time.sleep(3)
            else:
                logging.info("CDROM_DRIVE_STATUS: %d (%s)" % (rv, CDS_STR[rv]))
                time.sleep(3)
    
    def eject(self):
-        self.status["state"] = "ejecting"
-
        for i in range(20):
            try:
                fcntl.ioctl(self.drive, CDROM_LOCKDOOR, 0)
@@ -96,32 +91,28 @@ class Reader(threading.Thread):
                logging.error("Ejecting: %s" % e)
                time.sleep(i * 5)

-    # XXX: rename this to something like "write_status"
-    def finished(self, **kwargs):
-        self.status["state"] = "finished read"
-        fn = os.path.join(self.directory, self.status["title"], "sucker.json")
-        newfn = fn + ".new"
-        with open(newfn, "w") as fout:
-            json.dump(obj=self.status, fp=fout)
-        os.rename(src=newfn, dst=fn)
-
-    def handle_audio(self):
-        self.status["video"] = False
-
+    def handle(self, video):
+        self.status["video"] = video
        self.status["state"] = "reading"
-        cd.read(self.device, self.status)
+
+        state = {}
+        state["video"] = video
+        if video:
+            media = dvd
+        else:
+            media = cd
+
+        media.scan(state, self.device)
+        self.status["title"] = state["title"]
+        subdir = slugify.slugify(state["title"])
+        workdir = self.workdir(subdir)
+        os.makedirs(workdir, exist_ok=True)
        
-        directory = os.path.join(self.directory, status["title"])
-        os.makedirs(directory, exist_ok=True)
        self.status["state"] = "copying"
-        cd.copy(self.device, self.status, self.directory)
-        self.finished() # XXX: rename this to something like "write_status"
+        for pct in media.copy(state, self.device, workdir):
+            self.status["complete"] = pct

+        self.write_state(subdir, state)

-    def handle_data(self):
-        self.status["video"] = True
-        src = dvd.Copier(self.device, self.status)
-        src.copy(self.directory)
-        self.finished()

 # vi: sw=4 ts=4 et ai
--- a/src/statuser.py
+++ b/src/statuser.py
@@ -7,17 +7,17 @@ import time
 import os

 class Statuser(threading.Thread):
-    def __init__(self, workers, directory=None, **kwargs):
+    def __init__(self, workers, directory):
        self.workers = workers
        self.directory = directory
        self.status = {}
-        super().__init__(**kwargs)
+        super().__init__(daemon=True)

    def run(self):
        while True:
            self.status["finished"] = {
                "video": glob.glob(os.path.join(self.directory, "*.mkv")),
-                "audio": glob.glob(os.path.join(self.directory, "*/*/*.mp3")),
+                "audio": glob.glob(os.path.join(self.directory, "*/*.mp3")),
            }
            self.status["workers"] = [w.status for w in self.workers]
            time.sleep(12)
--- a/src/sucker.py
+++ b/src/sucker.py
@@ -33,13 +33,9 @@ def main():

    logging.basicConfig(level=logging.INFO)

-    readers = []
-    for d in args.drive:
-        readers.append(reader.Reader(d, directory=args.incoming, daemon=True))
-    encoders = []
-    for i in range(1):
-        encoders.append(encoder.Encoder(directory=args.incoming, daemon=True))
-    st = statuser.Statuser(readers + encoders, directory=args.incoming, daemon=True)
+    readers = [reader.Reader(d, args.incoming) for d in args.drive]
+    encoders = [encoder.Encoder(args.incoming) for i in range(1)]
+    st = statuser.Statuser(readers + encoders, args.incoming)

    [w.start() for w in readers + encoders]
    st.start()
--- a/src/worker.py
+++ b/src/worker.py
@@ -0,0 +1,32 @@
+import threading 
+import os
+import json
+import logging
+
+class Worker(threading.Thread):
+    def __init__(self, directory, **kwargs):
+        self.directory = directory
+        self.status = {
+            "state": "idle",
+        }
+        
+        kwargs["daemon"] = True
+        return super().__init__(**kwargs)
+
+    def workdir(self, *path):
+        return os.path.join(self.directory, *path)
+
+    def write_state(self, subdir, state):
+        logging.debug("Writing state: %s" % repr(state))
+        statefn = self.workdir(subdir, "sucker.json")
+        newstatefn = statefn + ".new"
+        with open(newstatefn, "w") as f:
+            json.dump(state, f)
+        os.rename(newstatefn, statefn)
+
+    def read_state(self, subdir):
+        with open(self.workdir(subdir, "sucker.json")) as f:
+            return json.load(f)
+
+    def clear_state(self, subdir):
+        os.unlink(self.workdir(subdir, "sucker.json"))
Author	SHA1	Message	Date
Neale Pickett	a559e92d3e	DVDs working	2022-08-25 10:17:25 -06:00
Neale Pickett	ede1fd22be	Audio CDs are working well!	2022-08-25 10:17:25 -06:00
Neale Pickett	91a332ca53	Refactor. CD maybe working?	2022-08-25 10:17:25 -06:00
Neale Pickett	2860a1405c	CD maybe mostly working?	2022-08-25 10:17:25 -06:00