mirror of
https://github.com/HugeFrog24/jailbirdz-dl.git
synced 2026-03-02 09:04:33 +00:00
Garbage commit; added junk code
This commit is contained in:
11
.env.example
Normal file
11
.env.example
Normal file
@@ -0,0 +1,11 @@
|
|||||||
|
# Copy your wordpress_logged_in_... cookie from browser DevTools → Storage → Cookies.
|
||||||
|
# Paste the full name=value pair below.
|
||||||
|
# wordpress_sec_... is the wp-admin cookie — irrelevant for read-only viewers.
|
||||||
|
# __cf_bm is a Cloudflare bot-management cookie — also not needed.
|
||||||
|
WP_LOGIN_COOKIE=wordpress_logged_in_<hash>=<value>
|
||||||
|
|
||||||
|
# PeerTube upload target
|
||||||
|
PEERTUBE_URL=https://your-peertube-instance.example
|
||||||
|
PEERTUBE_USER=admin
|
||||||
|
PEERTUBE_CHANNEL=your-channel-name
|
||||||
|
PEERTUBE_PASSWORD=your-password
|
||||||
14
.github/instructions/snyk_rules.instructions.md
vendored
Normal file
14
.github/instructions/snyk_rules.instructions.md
vendored
Normal file
@@ -0,0 +1,14 @@
|
|||||||
|
---
|
||||||
|
alwaysApply: true
|
||||||
|
always_on: true
|
||||||
|
trigger: always_on
|
||||||
|
applyTo: "**"
|
||||||
|
description: Snyk Security At Inception
|
||||||
|
---
|
||||||
|
|
||||||
|
# Project security best practices
|
||||||
|
|
||||||
|
- Always run snyk_code_scan tool for new first party code that is generated in a Snyk-supported language.
|
||||||
|
- If any security issues are found based on newly introduced or modified code or dependencies, attempt to fix the issues using the results context from Snyk.
|
||||||
|
- Rescan the code after fixing the issues to ensure that the issues were fixed and that there are no newly introduced issues.
|
||||||
|
- Repeat this process until no new issues are found.
|
||||||
5
.gitignore
vendored
Normal file
5
.gitignore
vendored
Normal file
@@ -0,0 +1,5 @@
|
|||||||
|
__pycache__/
|
||||||
|
downloads/
|
||||||
|
*.mp4
|
||||||
|
*.mp4.part
|
||||||
|
.env
|
||||||
4
.vscode/settings.json
vendored
Normal file
4
.vscode/settings.json
vendored
Normal file
@@ -0,0 +1,4 @@
|
|||||||
|
{
|
||||||
|
"snyk.advanced.organization": "512ef4a1-6034-4537-a391-9692d282122a",
|
||||||
|
"snyk.advanced.autoSelectOrganization": true
|
||||||
|
}
|
||||||
142
README.md
Normal file
142
README.md
Normal file
@@ -0,0 +1,142 @@
|
|||||||
|
# 𝒥𝒶𝒾𝓁𝒷𝒾𝓇𝒹𝓏-𝒹𝓁
|
||||||
|
|
||||||
|
Jailbirdz.com is an Arizona-based subscription video site publishing arrest and jail roleplay scenarios featuring women. This tool scrapes the member area, downloads the videos, and re-hosts them on a self-owned PeerTube instance.
|
||||||
|
|
||||||
|
> [!NOTE]
|
||||||
|
> This tool does not bypass authentication, modify the site, or intercept anything it isn't entitled to. A valid, paid membership is required. The scraper authenticates using your own session cookie and accesses only content your account can already view in a browser.
|
||||||
|
>
|
||||||
|
> Downloading content for private, personal use is permitted in many jurisdictions under private copy provisions (e.g., § 53 UrhG in Germany). You are responsible for determining whether this applies in yours.
|
||||||
|
|
||||||
|
## Requirements
|
||||||
|
|
||||||
|
- Python 3.10+
|
||||||
|
- `pip install -r requirements.txt`
|
||||||
|
- `playwright install firefox`
|
||||||
|
|
||||||
|
## Setup
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cp .env.example .env
|
||||||
|
```
|
||||||
|
|
||||||
|
### WP_LOGIN_COOKIE
|
||||||
|
|
||||||
|
You need to be logged into jailbirdz.com in a browser. Then either:
|
||||||
|
|
||||||
|
**Option A — auto (recommended):** let `grab_cookie.py` read it from your browser and write it to `.env` automatically:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python grab_cookie.py # tries Firefox, Chrome, Edge, Brave in order
|
||||||
|
python grab_cookie.py -b firefox # or target a specific browser
|
||||||
|
```
|
||||||
|
|
||||||
|
> **Note:** Chrome and Edge on Windows 130+ require the script to run as Administrator due to App-bound Encryption. Firefox works without elevated privileges.
|
||||||
|
|
||||||
|
**Option B — manual:** open `.env` and set `WP_LOGIN_COOKIE` yourself. Get the value from browser DevTools → Storage → Cookies while on jailbirdz.com — copy the full `name=value` of the `wordpress_logged_in_*` cookie.
|
||||||
|
|
||||||
|
### Other `.env` values
|
||||||
|
|
||||||
|
- `PEERTUBE_URL` — base URL of your PeerTube instance.
|
||||||
|
- `PEERTUBE_USER` — PeerTube username.
|
||||||
|
- `PEERTUBE_CHANNEL` — channel to upload to.
|
||||||
|
- `PEERTUBE_PASSWORD` — PeerTube password.
|
||||||
|
|
||||||
|
## Workflow
|
||||||
|
|
||||||
|
### 1. Scrape
|
||||||
|
|
||||||
|
Discovers all post URLs via the WordPress REST API, then visits each page with a headless Firefox browser to intercept video network requests (MP4, MOV, WebM, AVI, M4V).
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python main.py
|
||||||
|
```
|
||||||
|
|
||||||
|
Results are written to `video_map.json`. Safe to re-run — already-scraped posts are skipped.
|
||||||
|
|
||||||
|
### 2. Download
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python download.py [options]
|
||||||
|
|
||||||
|
Options:
|
||||||
|
-o, --output DIR Download directory (default: downloads)
|
||||||
|
-t, --titles Name files by post title
|
||||||
|
--original Name files by original CloudFront filename (default)
|
||||||
|
--reorganize Rename existing files to match current naming mode
|
||||||
|
-w, --workers N Concurrent downloads (default: 4)
|
||||||
|
-n, --dry-run Print what would be downloaded
|
||||||
|
```
|
||||||
|
|
||||||
|
Resumes partial downloads. The chosen naming mode is saved to `.naming_mode` inside the output directory and persists across runs. Filenames that would clash are placed into subfolders.
|
||||||
|
|
||||||
|
### 3. Upload
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python upload.py [options]
|
||||||
|
|
||||||
|
Options:
|
||||||
|
-i, --input DIR MP4 source directory (default: downloads)
|
||||||
|
--url URL PeerTube instance URL (or set PEERTUBE_URL)
|
||||||
|
-U, --username NAME PeerTube username (or set PEERTUBE_USER)
|
||||||
|
-p, --password SECRET PeerTube password (or set PEERTUBE_PASSWORD)
|
||||||
|
-C, --channel NAME Channel to upload to (or set PEERTUBE_CHANNEL)
|
||||||
|
-b, --batch-size N Videos to upload before waiting for transcoding (default: 1)
|
||||||
|
--poll-interval SECS State poll interval in seconds (default: 30)
|
||||||
|
--skip-wait Upload without waiting for transcoding
|
||||||
|
--nsfw Mark videos as NSFW
|
||||||
|
-n, --dry-run Print what would be uploaded
|
||||||
|
```
|
||||||
|
|
||||||
|
Uploads in resumable 10 MB chunks. After each batch, waits for transcoding and object storage to complete before uploading the next batch — this prevents disk exhaustion on the PeerTube server. Videos already present on the channel (matched by name) are skipped. Progress is tracked in `.uploaded` inside the input directory.
|
||||||
|
|
||||||
|
## Utilities
|
||||||
|
|
||||||
|
### Check for filename clashes
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python check_clashes.py
|
||||||
|
```
|
||||||
|
|
||||||
|
Lists filenames that map to more than one source URL, with sizes.
|
||||||
|
|
||||||
|
### Estimate total download size
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python total_size.py
|
||||||
|
```
|
||||||
|
|
||||||
|
Fetches `Content-Length` for every video URL in `video_map.json` and prints a size summary. Does not download anything.
|
||||||
|
|
||||||
|
## Data files
|
||||||
|
|
||||||
|
| File | Location | Description |
|
||||||
|
| ---------------- | ---------------- | --------------------------------------------------------------------- |
|
||||||
|
| `video_map.json` | project root | Scraped post URLs mapped to titles, descriptions, and video URLs |
|
||||||
|
| `.naming_mode` | output directory | Saved filename mode (`original` or `title`) |
|
||||||
|
| `.uploaded` | input directory | Newline-delimited list of relative paths already uploaded to PeerTube |
|
||||||
|
|
||||||
|
## FAQ
|
||||||
|
|
||||||
|
**Is this necessary?**
|
||||||
|
Yes, obviously.
|
||||||
|
|
||||||
|
**Is this project exactly what it looks like?**
|
||||||
|
Also yes.
|
||||||
|
|
||||||
|
**Why go to all this trouble?**
|
||||||
|
Middle school girls bullied me so hard I decided if you're going to be the weird kid anyway, you might as well commit to the bit and build highly specific pipelines for highly specific content.
|
||||||
|
Now it's their turn to get booked.
|
||||||
|
Checkmate, society.
|
||||||
|
No apologies.
|
||||||
|
|
||||||
|
**Why not just download everything manually?**
|
||||||
|
Dude.
|
||||||
|
Bondage fantasy.
|
||||||
|
Not pain play.
|
||||||
|
Huge difference.
|
||||||
|
1,300 clicks = torture.
|
||||||
|
Know your genres.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
This is the most normal thing I've scripted this month.
|
||||||
159
check_clashes.py
Normal file
159
check_clashes.py
Normal file
@@ -0,0 +1,159 @@
|
|||||||
|
"""Filename clash detection and shared URL utilities.
|
||||||
|
|
||||||
|
Importable functions:
|
||||||
|
url_to_filename(url) - extract clean filename from a URL
|
||||||
|
find_clashes(urls) - {filename: [urls]} for filenames with >1 source
|
||||||
|
build_download_paths(urls, output_dir) - {url: local_path} with clash resolution
|
||||||
|
fmt_size(bytes) - human-readable size string
|
||||||
|
get_remote_size(session, url) - file size via HEAD without downloading
|
||||||
|
fetch_sizes(urls, workers, on_progress) - bulk size lookup
|
||||||
|
make_session() - requests.Session with required headers
|
||||||
|
load_video_map() - load video_map.json, returns {} on missing/corrupt
|
||||||
|
"""
|
||||||
|
|
||||||
|
from collections import defaultdict
|
||||||
|
from concurrent.futures import ThreadPoolExecutor, as_completed
|
||||||
|
from pathlib import Path, PurePosixPath
|
||||||
|
from urllib.parse import urlparse, unquote
|
||||||
|
import json
|
||||||
|
import requests
|
||||||
|
from config import BASE_URL
|
||||||
|
|
||||||
|
REFERER = f"{BASE_URL}/"
|
||||||
|
VIDEO_MAP_FILE = "video_map.json"
|
||||||
|
VIDEO_EXTS = {".mp4", ".mov", ".m4v", ".webm", ".avi"}
|
||||||
|
|
||||||
|
|
||||||
|
def load_video_map():
    """Load video_map.json from the working directory.

    Returns the parsed mapping, or {} when the file is missing, unreadable,
    or contains invalid JSON — callers never see an exception.

    Uses EAFP (try/except FileNotFoundError) instead of the redundant
    exists()-then-open pattern: one filesystem access, no TOCTOU window.
    """
    try:
        with open(VIDEO_MAP_FILE, encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, OSError):
        # OSError covers FileNotFoundError and permission problems alike.
        return {}
|
||||||
|
|
||||||
|
|
||||||
|
def make_session():
    """Build a requests.Session carrying the Referer header the CDN expects."""
    session = requests.Session()
    session.headers["Referer"] = REFERER
    return session
|
||||||
|
|
||||||
|
|
||||||
|
def fmt_size(b):
    """Render a byte count as a human-readable string, e.g. '1.5 MB'.

    Scales through B/KB/MB/GB and tops out at TB (1024-based units).
    """
    units = ("B", "KB", "MB", "GB", "TB")
    idx = 0
    while b >= 1024 and idx < len(units) - 1:
        b /= 1024
        idx += 1
    return f"{b:.1f} {units[idx]}"
|
||||||
|
|
||||||
|
|
||||||
|
def url_to_filename(url):
    """Extract the percent-decoded basename from a URL's path component."""
    path = urlparse(url).path
    return unquote(PurePosixPath(path).name)
|
||||||
|
|
||||||
|
|
||||||
|
def find_clashes(urls):
    """Return {filename: [urls]} for every filename produced by more than one URL.

    Grouping is case-insensitive so that e.g. "DaisyArrest.mp4" and
    "daisyarrest.mp4" count as a clash. That is required for correctness on
    case-insensitive filesystems (NTFS, exFAT, macOS HFS+) and harmless on
    case-sensitive ones (ext4): files on disk keep their original casing;
    only the clash *detection* is folded. Each reported key uses the casing
    of the group's first URL.
    """
    groups = defaultdict(list)
    for url in urls:
        groups[url_to_filename(url).lower()].append(url)
    return {url_to_filename(members[0]): members
            for members in groups.values() if len(members) > 1}
|
||||||
|
|
||||||
|
|
||||||
|
def _clash_subfolder(url):
|
||||||
|
"""Parent path segment used as disambiguator for clashing filenames."""
|
||||||
|
parts = urlparse(url).path.rstrip("/").split("/")
|
||||||
|
return unquote(parts[-2]) if len(parts) >= 2 else "unknown"
|
||||||
|
|
||||||
|
|
||||||
|
def build_download_paths(urls, output_dir):
    """Map each URL to a local file path. Flat layout; clashing names get a subfolder."""
    clashing = {name.lower() for name in find_clashes(urls)}
    root = Path(output_dir)

    mapping = {}
    for url in urls:
        name = url_to_filename(url)
        if name.lower() in clashing:
            # Disambiguate with the URL's parent path segment.
            mapping[url] = root / _clash_subfolder(url) / name
        else:
            mapping[url] = root / name
    return mapping
|
||||||
|
|
||||||
|
|
||||||
|
def get_remote_size(session, url):
    """Best-effort remote file size in bytes, or None.

    Tries a HEAD request first; if that fails or carries no usable
    Content-Length, falls back to a one-byte ranged GET and parses the
    total out of Content-Range. Any network or parse error yields None.
    """
    try:
        head = session.head(url, allow_redirects=True, timeout=15)
        length = head.headers.get("Content-Length")
        if head.status_code < 400 and length is not None:
            return int(length)
    except Exception:
        pass

    try:
        probe = session.get(
            url, headers={"Range": "bytes=0-0"}, stream=True, timeout=15)
        probe.close()
        content_range = probe.headers.get("Content-Range", "")
        if "/" in content_range:
            # "bytes 0-0/12345" → total after the slash.
            return int(content_range.rsplit("/", 1)[-1])
    except Exception:
        pass

    return None
|
||||||
|
|
||||||
|
|
||||||
|
def fetch_sizes(urls, workers=20, on_progress=None):
    """Return {url: size_or_None}. on_progress(done, total) called after each URL."""
    session = make_session()
    total = len(urls)
    sizes = {}

    with ThreadPoolExecutor(max_workers=workers) as pool:
        submitted = {pool.submit(get_remote_size, session, url): url
                     for url in urls}
        for done, future in enumerate(as_completed(submitted), start=1):
            sizes[submitted[future]] = future.result()
            if on_progress:
                on_progress(done, total)

    return sizes
|
||||||
|
|
||||||
|
|
||||||
|
# --------------- CLI ---------------
|
||||||
|
|
||||||
|
def main():
    """CLI: report filename clashes across all scraped video URLs, with sizes."""
    video_map = load_video_map()
    urls = [video_url
            for entry in video_map.values()
            for video_url in entry.get("videos", [])
            if video_url.startswith("http")]

    clashes = find_clashes(urls)

    print(f"Total URLs: {len(urls)}")
    grouped = defaultdict(list)
    for video_url in urls:
        grouped[url_to_filename(video_url)].append(video_url)
    print(f"Unique filenames: {len(grouped)}")

    if not clashes:
        print("\nNo filename clashes — every filename is unique.")
        return

    clash_urls = [source for sources in clashes.values() for source in sources]
    print(f"\n[+] Fetching file sizes for {len(clash_urls)} clashing URLs…")
    sizes = fetch_sizes(clash_urls)

    print(f"\n{len(clashes)} filename clash(es):\n")
    for filename, sources in sorted(clashes.items()):
        print(f"  {filename} ({len(sources)} sources)")
        for source in sources:
            size = sizes.get(source)
            label = fmt_size(size) if size is not None else "unknown"
            print(f"    [{label}] {source}")
        print()


if __name__ == "__main__":
    main()
|
||||||
2
config.py
Normal file
2
config.py
Normal file
@@ -0,0 +1,2 @@
|
|||||||
|
BASE_URL = "https://www.jailbirdz.com"
|
||||||
|
COOKIE_DOMAIN = "jailbirdz.com" # rookiepy domain filter (no www)
|
||||||
408
download.py
Normal file
408
download.py
Normal file
@@ -0,0 +1,408 @@
|
|||||||
|
"""Download videos from video_map.json with resume, integrity checks, and naming modes.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python download.py # downloads with remembered (or default original) naming
|
||||||
|
python download.py --output /mnt/nas # custom directory
|
||||||
|
python download.py --titles # switch to title-based filenames (remembers choice)
|
||||||
|
python download.py --original # switch back to original filenames (remembers choice)
|
||||||
|
python download.py --reorganize # rename existing files to match current mode
|
||||||
|
python download.py --dry-run # preview what would happen
|
||||||
|
python download.py --workers 6 # override concurrency (default 4)
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import json
|
||||||
|
from pathlib import Path
|
||||||
|
import re
|
||||||
|
import shutil
|
||||||
|
from collections import defaultdict
|
||||||
|
from concurrent.futures import ThreadPoolExecutor, as_completed
|
||||||
|
|
||||||
|
from check_clashes import (
|
||||||
|
make_session,
|
||||||
|
fmt_size,
|
||||||
|
url_to_filename,
|
||||||
|
find_clashes,
|
||||||
|
build_download_paths,
|
||||||
|
fetch_sizes,
|
||||||
|
)
|
||||||
|
|
||||||
|
VIDEO_MAP_FILE = "video_map.json"
|
||||||
|
CHUNK_SIZE = 8 * 1024 * 1024
|
||||||
|
DEFAULT_OUTPUT = "downloads"
|
||||||
|
DEFAULT_WORKERS = 4
|
||||||
|
MODE_FILE = ".naming_mode"
|
||||||
|
MODE_ORIGINAL = "original"
|
||||||
|
MODE_TITLE = "title"
|
||||||
|
|
||||||
|
|
||||||
|
# ── Naming mode persistence ──────────────────────────────────────────
|
||||||
|
|
||||||
|
def read_mode(output_dir):
    """Return the naming mode saved in *output_dir*, or None if none was saved."""
    marker = Path(output_dir) / MODE_FILE
    if not marker.exists():
        return None
    return marker.read_text().strip()
|
||||||
|
|
||||||
|
|
||||||
|
def write_mode(output_dir, mode):
    """Persist *mode* as the naming mode for *output_dir*, creating it if needed."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / MODE_FILE).write_text(mode)
|
||||||
|
|
||||||
|
|
||||||
|
def resolve_mode(args):
    """Determine naming mode from CLI flags + saved marker. Returns mode string.

    Precedence: explicit flag > marker saved in the output directory >
    MODE_ORIGINAL. Conflicting flags abort (argparse's mutually-exclusive
    group normally prevents this; the check is a defensive backstop).
    """
    saved = read_mode(args.output)

    if args.titles and args.original:
        print("[!] Cannot use --titles and --original together.")
        raise SystemExit(1)

    if args.titles:
        return MODE_TITLE
    if args.original:
        return MODE_ORIGINAL
    return saved or MODE_ORIGINAL
|
||||||
|
|
||||||
|
|
||||||
|
# ── Filename helpers ─────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def sanitize_filename(title, max_len=180):
    """Make *title* safe for use as a filename.

    Removes characters Windows forbids, collapses whitespace runs to single
    spaces, trims surrounding space and trailing dots, and truncates to
    *max_len* characters (re-trimming any trailing space the cut exposes).
    """
    cleaned = re.sub(r'[<>:"/\\|?*]', '', title)
    cleaned = re.sub(r'\s+', ' ', cleaned).strip().rstrip('.')
    if len(cleaned) > max_len:
        return cleaned[:max_len].rstrip()
    return cleaned
|
||||||
|
|
||||||
|
|
||||||
|
def build_title_paths(urls, url_to_title, output_dir):
    """Map URLs to title-based local paths; clashing titles get a [slug] suffix.

    Falls back to the original filename's stem when a URL has no title, and
    keeps the original extension (defaulting to .mp4 if there is none).
    """
    # First pass: compute each URL's (stem, ext) and count how many URLs
    # land on the same full name.
    name_counts = defaultdict(list)
    parts = {}
    for url in urls:
        title = url_to_title.get(url)
        original_name = url_to_filename(url)
        ext = Path(original_name).suffix or ".mp4"
        stem = sanitize_filename(title) if title else Path(original_name).stem
        parts[url] = (stem, ext)
        name_counts[stem + ext].append(url)

    # Second pass: append the CloudFront slug to any name shared by >1 URL.
    root = Path(output_dir)
    result = {}
    for url in urls:
        stem, ext = parts[url]
        full = stem + ext
        if len(name_counts[full]) > 1:
            slug = url_to_filename(url).rsplit('.', 1)[0]
            result[url] = root / f"{stem} [{slug}]{ext}"
        else:
            result[url] = root / full
    return result
|
||||||
|
|
||||||
|
|
||||||
|
def get_paths_for_mode(mode, urls, video_map, output_dir):
    """Dispatch to the path builder matching *mode* (title vs original names)."""
    if mode != MODE_TITLE:
        return build_download_paths(urls, output_dir)
    titles = build_url_title_map(video_map)
    return build_title_paths(urls, titles, output_dir)
|
||||||
|
|
||||||
|
|
||||||
|
# ── Reorganize ───────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def reorganize(urls, video_map, output_dir, target_mode, dry_run=False):
    """Rename existing files from one naming scheme to another.

    Computes where each URL's file would live under the *other* mode and
    under *target_mode*, moves any file (and its .part sibling) found at the
    old path but absent at the new one, prunes directories emptied by the
    moves, then persists *target_mode* via write_mode(). With dry_run=True,
    only prints the planned renames and changes nothing on disk.
    """
    other_mode = MODE_TITLE if target_mode == MODE_ORIGINAL else MODE_ORIGINAL
    old_paths = get_paths_for_mode(other_mode, urls, video_map, output_dir)
    new_paths = get_paths_for_mode(target_mode, urls, video_map, output_dir)

    moves = []
    for url in urls:
        old = old_paths[url]
        new = new_paths[url]
        if old == new:
            continue
        # Only move when the source exists and the destination does not —
        # never clobbers a file already in place under the new scheme.
        if old.exists() and not new.exists():
            moves.append((old, new))
        # also handle .part files
        old_part = old.parent / (old.name + ".part")
        new_part = new.parent / (new.name + ".part")
        if old_part.exists() and not new_part.exists():
            moves.append((old_part, new_part))

    if not moves:
        print("[✓] Nothing to reorganize — files already match the target mode.")
        return

    print(f"[+] {len(moves)} file(s) to rename ({other_mode} → {target_mode}):\n")

    for old, new in moves:
        old_rel = old.relative_to(output_dir)
        new_rel = new.relative_to(output_dir)
        if dry_run:
            print(f"  [dry-run] {old_rel} → {new_rel}")
        else:
            new.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(old, new)
            print(f"  ✓ {old_rel} → {new_rel}")

    if not dry_run:
        # Clean up empty directories left behind: walk upward from each
        # vacated parent; rmdir() raises OSError on a non-empty directory,
        # which stops the walk before anything with content is touched.
        output_path = Path(output_dir)
        for old, _ in moves:
            d = old.parent
            while d != output_path:
                try:
                    d.rmdir()
                except OSError:
                    break
                d = d.parent

        write_mode(output_dir, target_mode)
        print(f"\n[✓] Reorganized. Mode saved: {target_mode}")
    else:
        print(f"\n[dry-run] Would rename {len(moves)} files. No changes made.")
|
||||||
|
|
||||||
|
|
||||||
|
# ── Download ─────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def download_one(session, url, dest, expected_size):
    """Download *url* to *dest*, resuming from a .part file when possible.

    Returns (status, bytes_written) where status is "ok", "size_mismatch",
    or "error: …". When *expected_size* is known, a complete existing file
    is skipped and a wrong-sized one is deleted and re-fetched.

    Fixes vs. the original:
    - append to the .part only when the server actually honored the Range
      request (HTTP 206) — a 200 reply carries the whole body, and appending
      it would corrupt the partial file;
    - finalize with Path.replace() (atomic, overwrites on Windows too) rather
      than rename(), which fails on Windows if *dest* already exists (possible
      when expected_size is unknown).
    """
    dest = Path(dest)
    part = dest.parent / (dest.name + ".part")
    dest.parent.mkdir(parents=True, exist_ok=True)

    if dest.exists():
        local = dest.stat().st_size
        if expected_size and local == expected_size:
            return "ok", 0  # already complete — nothing to do
        if expected_size and local != expected_size:
            dest.unlink()  # wrong size: discard and re-download

    # Resume only when we know the total size and the partial is short of it.
    existing = part.stat().st_size if part.exists() else 0
    headers = {}
    if existing and expected_size and existing < expected_size:
        headers["Range"] = f"bytes={existing}-"

    try:
        r = session.get(url, headers=headers, stream=True, timeout=60)

        # 416 = range not satisfiable: the .part already holds the whole
        # file. Promote it; without a .part, report an error instead of
        # raising from inside this try.
        if r.status_code == 416:
            if part.exists():
                part.replace(dest)
                return "ok", 0
            return "error: 416 with no partial file", 0

        r.raise_for_status()
    except Exception as e:
        return f"error: {e}", 0

    # Append only on a genuine partial response; otherwise start over.
    resuming = bool(headers.get("Range")) and r.status_code == 206
    mode = "ab" if resuming else "wb"
    if mode == "wb":
        existing = 0

    written = 0
    try:
        with open(part, mode) as f:
            for chunk in r.iter_content(chunk_size=CHUNK_SIZE):
                f.write(chunk)
                written += len(chunk)
    except Exception as e:
        # Keep the .part so a later run can resume from it.
        return f"error: {e}", written

    final_size = existing + written
    if expected_size and final_size != expected_size:
        return "size_mismatch", written

    part.replace(dest)
    return "ok", written
|
||||||
|
|
||||||
|
|
||||||
|
# ── Data loading ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def load_video_map():
    """Parse video_map.json; propagates errors if the file is missing or invalid.

    Intentionally strict (unlike check_clashes.load_video_map): downloading
    without a scrape result should fail loudly.
    """
    with open(VIDEO_MAP_FILE, encoding="utf-8") as fh:
        return json.load(fh)
|
||||||
|
|
||||||
|
|
||||||
|
def _is_valid_url(url):
|
||||||
|
return url.startswith(
|
||||||
|
"http") and "<" not in url and ">" not in url and " href=" not in url
|
||||||
|
|
||||||
|
|
||||||
|
def collect_urls(video_map):
    """Flatten video_map into a de-duplicated list of well-formed video URLs.

    Preserves first-seen order; counts and reports (but drops) malformed
    entries that look like scraped HTML rather than URLs.
    """
    seen = set()
    urls = []
    skipped = 0
    for entry in video_map.values():
        for candidate in entry.get("videos", []):
            if candidate in seen:
                continue
            seen.add(candidate)
            if _is_valid_url(candidate):
                urls.append(candidate)
            else:
                skipped += 1
    if skipped:
        print(f"[!] Skipped {skipped} malformed URL(s)")
    return urls
|
||||||
|
|
||||||
|
|
||||||
|
def build_url_title_map(video_map):
    """Map each video URL to the title of the first post that references it."""
    mapping = {}
    for entry in video_map.values():
        post_title = entry.get("title", "")
        for video_url in entry.get("videos", []):
            # setdefault keeps the first post's title when URLs repeat.
            mapping.setdefault(video_url, post_title)
    return mapping
|
||||||
|
|
||||||
|
|
||||||
|
# ── Main ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def main():
    """CLI entry point: download everything in video_map.json to the output
    directory, honoring the directory's persisted naming mode.

    Flow: parse args → resolve naming mode → refuse to download if the mode
    changed without --reorganize → optionally reorganize → compute paths →
    verify already-downloaded files against remote sizes (re-queuing
    mismatches) → download pending files concurrently.
    """
    parser = argparse.ArgumentParser(
        description="Download videos from video_map.json")
    parser.add_argument("--output", "-o", default=DEFAULT_OUTPUT,
                        help=f"Download directory (default: {DEFAULT_OUTPUT})")

    naming = parser.add_mutually_exclusive_group()
    naming.add_argument("--titles", "-t", action="store_true",
                        help="Use title-based filenames (saved as default for this directory)")
    naming.add_argument("--original", action="store_true",
                        help="Use original CloudFront filenames (saved as default for this directory)")

    parser.add_argument("--reorganize", action="store_true",
                        help="Rename existing files to match the current naming mode")
    parser.add_argument("--dry-run", "-n", action="store_true",
                        help="Preview without making changes")
    parser.add_argument("--workers", "-w", type=int, default=DEFAULT_WORKERS,
                        help=f"Concurrent downloads (default: {DEFAULT_WORKERS})")
    args = parser.parse_args()

    video_map = load_video_map()
    urls = collect_urls(video_map)
    mode = resolve_mode(args)

    # Detect a naming-mode switch relative to what this directory last used.
    saved = read_mode(args.output)
    mode_changed = saved is not None and saved != mode

    print(f"[+] {len(urls)} MP4 URLs from {VIDEO_MAP_FILE}")
    print(f"[+] Naming mode: {mode}" + (" (changed!)" if mode_changed else ""))

    # Handle reorganize
    if args.reorganize or mode_changed:
        if mode_changed and not args.reorganize:
            # Refuse to mix naming schemes in one directory: the user must
            # explicitly reorganize before downloading under the new mode.
            print(f"\n[!] Mode changed from '{saved}' to '{mode}'.")
            print(
                "    Use --reorganize to rename existing files, or --dry-run to preview.")
            print("    Refusing to download until existing files are reorganized.")
            return
        reorganize(urls, video_map, args.output, mode, dry_run=args.dry_run)
        if args.dry_run or args.reorganize:
            return

    # Save mode
    if not args.dry_run:
        write_mode(args.output, mode)

    paths = get_paths_for_mode(mode, urls, video_map, args.output)

    clashes = find_clashes(urls)
    if clashes:
        print(
            f"[+] {len(clashes)} filename clash(es) resolved with subfolders/suffixes")

    # Partition by whether the final (non-.part) file already exists.
    already = [u for u in urls if paths[u].exists()]
    pending = [u for u in urls if not paths[u].exists()]

    print(f"[+] Already downloaded: {len(already)}")
    print(f"[+] To download: {len(pending)}")

    if not pending:
        print("\n[✓] Everything is already downloaded.")
        return

    if args.dry_run:
        print(
            f"\n[dry-run] Would download {len(pending)} files to {args.output}/")
        for url in pending[:20]:
            print(f"  → {paths[url].name}")
        if len(pending) > 20:
            print(f"  … and {len(pending) - 20} more")
        return

    print("\n[+] Fetching remote file sizes…")
    session = make_session()
    remote_sizes = fetch_sizes(pending, workers=20)

    # Some URLs may not report a size; only the known ones count here.
    sized = {u: s for u, s in remote_sizes.items() if s is not None}
    total_bytes = sum(sized.values())
    print(
        f"[+] Download size: {fmt_size(total_bytes)} across {len(pending)} files")

    if already:
        # Integrity check: compare each existing file against the remote
        # size and re-queue mismatches for download.
        print(f"[+] Verifying {len(already)} existing files…")
        already_sizes = fetch_sizes(already, workers=20)

        mismatched = 0
        for url in already:
            dest = paths[url]
            local = dest.stat().st_size
            remote = already_sizes.get(url)
            if remote and local != remote:
                mismatched += 1
                print(f"[!] Size mismatch: {dest.name} "
                      f"(local {fmt_size(local)} vs remote {fmt_size(remote)})")
                pending.append(url)
                remote_sizes[url] = remote

        if mismatched:
            print(
                f"[!] {mismatched} file(s) will be re-downloaded due to size mismatch")

    print(f"\n[⚡] Downloading with {args.workers} threads…\n")

    # `total` is taken AFTER mismatch re-queuing so the progress counter
    # reflects the real workload.
    completed = 0
    failed = []
    total_written = 0
    total = len(pending)
    interrupted = False

    def do_download(url):
        # Worker body: resolve destination + expected size, then download.
        dest = paths[url]
        expected = remote_sizes.get(url)
        return url, download_one(session, url, dest, expected)

    try:
        with ThreadPoolExecutor(max_workers=args.workers) as pool:
            futures = {pool.submit(do_download, u): u for u in pending}
            for fut in as_completed(futures):
                url, (status, written) = fut.result()
                total_written += written
                completed += 1
                name = paths[url].name

                if status == "ok" and written > 0:
                    print(
                        f"  [{completed}/{total}] ✓ {name} ({fmt_size(written)})")
                elif status == "ok":
                    print(
                        f"  [{completed}/{total}] ✓ {name} (already complete)")
                elif status == "size_mismatch":
                    print(f"  [{completed}/{total}] ⚠ {name} (size mismatch)")
                    failed.append(url)
                else:
                    print(f"  [{completed}/{total}] ✗ {name} ({status})")
                    failed.append(url)
    except KeyboardInterrupt:
        # Abandon queued work but keep finished/partial files on disk;
        # .part files let the next run resume.
        interrupted = True
        pool.shutdown(wait=False, cancel_futures=True)
        print("\n\n[⏸] Interrupted! Partial downloads saved as .part files.")

    print(f"\n{'=' * 50}")
    print(f"  Downloaded: {fmt_size(total_written)}")
    print(f"  Completed:  {completed}/{total}")
    if failed:
        print(f"  Failed:     {len(failed)} (re-run to retry)")
    if interrupted:
        print("  Paused — re-run to resume.")
    elif not failed:
        print("  All done!")
    print(f"{'=' * 50}")


if __name__ == "__main__":
    main()
|
||||||
114
grab_cookie.py
Normal file
114
grab_cookie.py
Normal file
@@ -0,0 +1,114 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
grab_cookie.py — read the WordPress login cookie from an
|
||||||
|
installed browser and write it to .env as WP_LOGIN_COOKIE=name=value.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python grab_cookie.py # tries Firefox, Chrome, Edge, Brave
|
||||||
|
python grab_cookie.py --browser firefox # explicit browser
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
from pathlib import Path
|
||||||
|
from config import COOKIE_DOMAIN
|
||||||
|
|
||||||
|
ENV_FILE = Path(".env")
|
||||||
|
ENV_KEY = "WP_LOGIN_COOKIE"
|
||||||
|
COOKIE_PREFIX = "wordpress_logged_in_"
|
||||||
|
|
||||||
|
BROWSER_NAMES = ["firefox", "chrome", "edge", "brave"]
|
||||||
|
|
||||||
|
|
||||||
|
def find_cookie(browser_name):
    """Return (name, value) for the wordpress_logged_in_* cookie, or (None, None).

    Raises:
        ImportError: rookiepy is not installed.
        ValueError: rookiepy has no loader named `browser_name`.
        PermissionError: the browser's cookie store could not be read.
        RuntimeError: any other cookie-extraction failure.
    """
    try:
        import rookiepy
    except ImportError as e:
        # Chain the cause so the real import failure stays visible in tracebacks.
        raise ImportError("rookiepy not installed — run: pip install rookiepy") from e

    # rookiepy exposes one loader function per supported browser.
    fn = getattr(rookiepy, browser_name, None)
    if fn is None:
        raise ValueError(f"rookiepy does not support '{browser_name}'.")

    try:
        cookies = fn([COOKIE_DOMAIN])
    except PermissionError as e:
        raise PermissionError(
            f"Permission denied reading {browser_name} cookies.\n"
            "    Close the browser, or on Windows run as Administrator for Chrome/Edge."
        ) from e
    except Exception as e:
        raise RuntimeError(f"Could not read {browser_name} cookies: {e}") from e

    # First matching login cookie wins.
    for c in cookies:
        if c.get("name", "").startswith(COOKIE_PREFIX):
            return c["name"], c["value"]

    return None, None
|
||||||
|
|
||||||
|
|
||||||
|
def update_env(name, value):
    """Persist WP_LOGIN_COOKIE=name=value into .env, replacing any existing entry.

    Returns "updated", "appended", or "created" depending on what was done.
    """
    entry = f"{ENV_KEY}={name}={value}\n"

    # Fresh file: just write the single entry.
    if not ENV_FILE.exists():
        ENV_FILE.write_text(entry, encoding="utf-8")
        return "created"

    content = ENV_FILE.read_text(encoding="utf-8")
    rows = content.splitlines(keepends=True)
    for pos, row in enumerate(rows):
        # Replace either a full KEY=... line or a bare KEY line.
        if row.startswith(f"{ENV_KEY}=") or row.strip() == ENV_KEY:
            rows[pos] = entry
            ENV_FILE.write_text("".join(rows), encoding="utf-8")
            return "updated"

    # Key absent — append, making sure the file ends with a newline first.
    if content and not content.endswith("\n"):
        content += "\n"
    ENV_FILE.write_text(content + entry, encoding="utf-8")
    return "appended"
|
||||||
|
|
||||||
|
|
||||||
|
def main():
    """CLI entry: read the login cookie from a browser and store it in .env."""
    parser = argparse.ArgumentParser(
        description=f"Copy the {COOKIE_DOMAIN} login cookie from your browser into .env."
    )
    parser.add_argument(
        "--browser", "-b",
        choices=BROWSER_NAMES,
        metavar="BROWSER",
        help=f"Browser to read from: {', '.join(BROWSER_NAMES)} (default: try all in order)",
    )
    args = parser.parse_args()

    # Explicit browser wins; otherwise try each supported browser in order.
    order = [args.browser] if args.browser else BROWSER_NAMES

    cookie_name = cookie_value = None
    for browser in order:
        print(f"[…] Trying {browser}…")
        try:
            cookie_name, cookie_value = find_cookie(browser)
        except ImportError as e:
            # rookiepy missing is unrecoverable — abort immediately.
            raise SystemExit(f"[!] {e}")
        except (ValueError, PermissionError, RuntimeError) as e:
            # Browser-specific failure — report and try the next one.
            print(f"[!] {e}")
            continue

        if cookie_name:
            print(f"[+] Found in {browser}: {cookie_name}")
            break
        print(f"    No {COOKIE_PREFIX}* cookie found in {browser}.")

    if not cookie_name:
        raise SystemExit(
            f"\n[!] No {COOKIE_PREFIX}* cookie found in any browser.\n"
            f"    Make sure you are logged into {COOKIE_DOMAIN}, then re-run.\n"
            "    Or set WP_LOGIN_COOKIE manually in .env — see .env.example."
        )

    action = update_env(cookie_name, cookie_value)
    print(f"[✓] {ENV_KEY} {action} in {ENV_FILE}.")
|
||||||
|
|
||||||
|
|
||||||
|
# Script entry point.
if __name__ == "__main__":
    main()
|
||||||
467
main.py
Normal file
467
main.py
Normal file
@@ -0,0 +1,467 @@
|
|||||||
|
import re
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import time
|
||||||
|
import signal
|
||||||
|
import asyncio
|
||||||
|
import tempfile
|
||||||
|
import requests
|
||||||
|
from pathlib import Path, PurePosixPath
|
||||||
|
from urllib.parse import urlparse
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from playwright.async_api import async_playwright
|
||||||
|
from check_clashes import VIDEO_EXTS
|
||||||
|
from config import BASE_URL
|
||||||
|
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
|
||||||
|
def _is_video_url(url):
    """True if `url`'s path ends with a recognised video extension (case-insensitive)."""
    path = urlparse(url).path
    extension = PurePosixPath(path).suffix.lower()
    return extension in VIDEO_EXTS
|
||||||
|
# Root of the WordPress REST API (v2) on the target site.
WP_API = f"{BASE_URL}/wp-json/wp/v2"

# REST content types never worth scraping: media attachments, menus, and
# block/template/font internals.
SKIP_TYPES = {
    "attachment", "nav_menu_item", "wp_block", "wp_template",
    "wp_template_part", "wp_global_styles", "wp_navigation",
    "wp_font_family", "wp_font_face",
}

# On-disk map: post URL -> {title, description, videos, scraped_at}.
VIDEO_MAP_FILE = "video_map.json"
# Number of concurrent Playwright pages.
MAX_WORKERS = 4

# Browser-like headers sent with plain REST requests.
API_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:147.0) Gecko/20100101 Firefox/147.0",
    "Accept": "application/json",
    "Referer": f"{BASE_URL}/",
}
|
||||||
|
|
||||||
|
|
||||||
|
def _get_login_cookie():
|
||||||
|
raw = os.environ.get("WP_LOGIN_COOKIE", "").strip() # strip accidental whitespace
|
||||||
|
if not raw:
|
||||||
|
raise RuntimeError(
|
||||||
|
"WP_LOGIN_COOKIE not set. Copy it from your browser into .env — see .env.example.")
|
||||||
|
name, _, value = raw.partition("=")
|
||||||
|
if not value:
|
||||||
|
raise RuntimeError(
|
||||||
|
"WP_LOGIN_COOKIE looks malformed (no '=' found). Expected: name=value")
|
||||||
|
if not name.startswith("wordpress_logged_in_"):
|
||||||
|
raise RuntimeError(
|
||||||
|
"WP_LOGIN_COOKIE doesn't look right — expected a wordpress_logged_in_... cookie.")
|
||||||
|
return name, value
|
||||||
|
|
||||||
|
|
||||||
|
def discover_content_types(session):
    """Return (name, rest_base, type_slug) for every scrape-worthy content type.

    Queries /wp-json/wp/v2/types, skipping internal types (SKIP_TYPES) and
    any type that exposes no REST collection.
    """
    resp = session.get(f"{WP_API}/types", timeout=30)
    resp.raise_for_status()

    targets = []
    for type_slug, info in resp.json().items():
        if type_slug in SKIP_TYPES:
            continue
        rest_base = info.get("rest_base")
        if not rest_base:
            # Type is not exposed over REST — nothing to paginate.
            continue
        targets.append((info.get("name", type_slug), rest_base, type_slug))
    return targets
|
||||||
|
|
||||||
|
|
||||||
|
def fetch_all_posts_for_type(session, type_name, rest_base, type_slug):
    """Paginate one content type and return (url, title, description) tuples.

    Uses the `link` field when available; falls back to building from slug."""
    # Pretty-permalink prefix used when a post carries no absolute `link`.
    url_prefix = type_slug.replace("_", "-")
    results = []
    page = 1

    while True:
        r = session.get(
            f"{WP_API}/{rest_base}",
            params={"per_page": 100, "page": page},
            timeout=30,
        )
        # WordPress answers 400 once `page` runs past the last page; stop
        # there (and on any other non-OK status).
        if r.status_code == 400 or not r.ok:
            break
        data = r.json()
        if not data:
            break
        for post in data:
            link = post.get("link", "")
            if not link.startswith("http"):
                # No absolute link — reconstruct the permalink from the slug,
                # or skip the post entirely when even the slug is missing.
                slug = post.get("slug")
                if slug:
                    link = f"{BASE_URL}/{url_prefix}/{slug}/"
                else:
                    continue
            # REST returns title/content as {"rendered": ...} dicts, but
            # tolerate plain values too.
            title_obj = post.get("title", {})
            title = title_obj.get("rendered", "") if isinstance(
                title_obj, dict) else str(title_obj)
            content_obj = post.get("content", {})
            content_html = content_obj.get(
                "rendered", "") if isinstance(content_obj, dict) else ""
            description = html_to_text(content_html) if content_html else ""
            results.append((link, title, description))
        print(f"  {type_name} page {page}: {len(data)} items")
        page += 1

    return results
|
||||||
|
|
||||||
|
|
||||||
|
def fetch_post_urls_from_api(headers):
    """Auto-discover all content types via the WP REST API and collect every post URL.

    Also builds video_map.json with titles pre-populated."""
    print("[+] video_map.json empty or missing — discovering content types from REST API…")
    session = requests.Session()
    session.headers.update(headers)

    targets = discover_content_types(session)
    print(
        f"[+] Found {len(targets)} content types: {', '.join(name for name, _, _ in targets)}\n")

    # Flatten every type's pages into one (url, title, description) list.
    all_results = []
    for type_name, rest_base, type_slug in targets:
        type_results = fetch_all_posts_for_type(
            session, type_name, rest_base, type_slug)
        all_results.extend(type_results)

    # De-duplicate while preserving first-seen order, merging metadata into
    # the existing map without clobbering already-known titles/descriptions.
    seen = set()
    deduped_urls = []
    video_map = load_video_map()

    for url, title, description in all_results:
        if url not in seen and url.startswith("http"):
            seen.add(url)
            deduped_urls.append(url)
            if url not in video_map:
                video_map[url] = {"title": title,
                                  "description": description, "videos": []}
            else:
                if not video_map[url].get("title"):
                    video_map[url]["title"] = title
                if not video_map[url].get("description"):
                    video_map[url]["description"] = description

    save_video_map(video_map)
    print(
        f"\n[+] Discovered {len(deduped_urls)} unique URLs → saved to {VIDEO_MAP_FILE}")
    print(
        f"[+] Pre-populated {len(video_map)} entries in {VIDEO_MAP_FILE}")
    return deduped_urls
|
||||||
|
|
||||||
|
|
||||||
|
def fetch_metadata_from_api(video_map, urls, headers):
    """Populate missing titles and descriptions in video_map from the REST API."""
    # Only bother with URLs that are absent or missing title/description.
    missing = [u for u in urls
               if u not in video_map
               or not video_map[u].get("title")
               or not video_map[u].get("description")]
    if not missing:
        return

    print(f"[+] Fetching metadata from REST API for {len(missing)} posts…")
    session = requests.Session()
    session.headers.update(headers)

    targets = discover_content_types(session)

    # Re-walk every content type; fill gaps without overwriting existing data.
    for type_name, rest_base, type_slug in targets:
        type_results = fetch_all_posts_for_type(
            session, type_name, rest_base, type_slug)
        for url, title, description in type_results:
            if url in video_map:
                if not video_map[url].get("title"):
                    video_map[url]["title"] = title
                if not video_map[url].get("description"):
                    video_map[url]["description"] = description
            else:
                video_map[url] = {"title": title,
                                  "description": description, "videos": []}

    save_video_map(video_map)
    # Report coverage over the requested URL set only.
    populated_t = sum(1 for u in urls if video_map.get(u, {}).get("title"))
    populated_d = sum(1 for u in urls if video_map.get(
        u, {}).get("description"))
    print(f"[+] Titles populated: {populated_t}/{len(urls)}")
    print(f"[+] Descriptions populated: {populated_d}/{len(urls)}")
|
||||||
|
|
||||||
|
|
||||||
|
def load_post_urls(headers):
    """Return all post URLs: from the cached map when present, else via the REST API."""
    cached = load_video_map()
    if not cached:
        # No local map yet — run full API discovery (also seeds the map).
        return fetch_post_urls_from_api(headers)
    print(f"[+] {VIDEO_MAP_FILE} found — loading {len(cached)} post URLs.")
    return list(cached)
|
||||||
|
|
||||||
|
|
||||||
|
def html_to_text(html_str):
    """Convert an HTML fragment to tidy plain text.

    <br> becomes a newline and </p> a paragraph break; all other tags are
    removed, entities decoded, line edges stripped, and blank runs collapsed
    to a single empty line.
    """
    import html

    converted = re.sub(r'<br\s*/?>', '\n', html_str)
    converted = converted.replace('</p>', '\n\n')
    converted = re.sub(r'<[^>]+>', '', converted)
    converted = html.unescape(converted)

    stripped = (line.strip() for line in converted.splitlines())
    converted = '\n'.join(stripped)
    converted = re.sub(r'\n{3,}', '\n\n', converted)
    return converted.strip()
|
||||||
|
|
||||||
|
|
||||||
|
def extract_mp4_from_html(html):
    """Return every http(s) URL in `html` that points at a video file."""
    url_pattern = r'https?://[^\s"\'<>]+'
    return list(filter(_is_video_url, re.findall(url_pattern, html)))
|
||||||
|
|
||||||
|
|
||||||
|
def extract_title_from_html(html):
    """Best-effort page title.

    Prefers the entry-title <h1>; otherwise falls back to <title> with any
    " - Site" style suffix dropped. Returns None when neither is found.
    """
    h1_match = re.search(
        r'<h1[^>]*class="entry-title"[^>]*>(.*?)</h1>', html, re.DOTALL)
    if h1_match:
        # The heading may contain inline markup — strip it out.
        return re.sub(r'<[^>]+>', '', h1_match.group(1)).strip()

    title_match = re.search(r'<title>(.*?)(?:\s*[-–|].*)?</title>', html, re.DOTALL)
    if title_match:
        return title_match.group(1).strip()

    return None
|
||||||
|
|
||||||
|
|
||||||
|
def load_video_map():
    """Load the URL→metadata map from disk; absent or unreadable files yield {}."""
    if not Path(VIDEO_MAP_FILE).exists():
        return {}
    try:
        with open(VIDEO_MAP_FILE, encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, OSError):
        # Corrupt or unreadable map — start fresh rather than crash.
        return {}
|
||||||
|
|
||||||
|
|
||||||
|
def save_video_map(video_map):
    """Atomically persist `video_map` to VIDEO_MAP_FILE as pretty-printed JSON."""
    # Write to a temp file in the same directory, then atomically replace the
    # real file — a crash mid-write can never corrupt video_map.json.
    fd, tmp_path = tempfile.mkstemp(dir=Path(VIDEO_MAP_FILE).resolve().parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(video_map, f, indent=2, ensure_ascii=False)
        Path(tmp_path).replace(VIDEO_MAP_FILE)
    except Exception:
        # Best-effort cleanup of the orphaned temp file before re-raising.
        try:
            Path(tmp_path).unlink()
        except OSError:
            pass
        raise
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
def _expects_video(url):
|
||||||
|
return "/pinkcuffs-videos/" in url
|
||||||
|
|
||||||
|
|
||||||
|
# Per-URL navigation/extraction retries before deferring the URL to the next run.
MAX_RETRIES = 2
|
||||||
|
|
||||||
|
|
||||||
|
async def worker(worker_id, queue, context, known,
                 total, retry_counts, video_map, map_lock, shutdown_event):
    """Drain `queue`: render each post URL in a Playwright page, harvest video
    URLs from both the final HTML and network responses, and persist results
    into `video_map` (all map reads/writes guarded by `map_lock`)."""
    page = await context.new_page()
    video_hits = set()

    # Capture video URLs that only appear as network responses (players that
    # load media without embedding the URL in the HTML).
    page.on("response", lambda resp: video_hits.add(resp.url) if _is_video_url(resp.url) else None)

    try:
        while not shutdown_event.is_set():
            try:
                idx, url = queue.get_nowait()
            except asyncio.QueueEmpty:
                break

            attempt = retry_counts.get(idx, 0)
            label = f" (retry {attempt}/{MAX_RETRIES})" if attempt else ""
            print(f"[W{worker_id}] ({idx + 1}/{total}) {url}{label}")

            try:
                await page.goto(url, wait_until="networkidle", timeout=60000)
            except Exception as e:
                print(f"[W{worker_id}] Navigation error: {e}")
                if _expects_video(url) and attempt < MAX_RETRIES:
                    # Video posts are worth retrying within this run.
                    retry_counts[idx] = attempt + 1
                    queue.put_nowait((idx, url))
                    print(f"[W{worker_id}] Re-queued for retry.")
                elif not _expects_video(url):
                    # Non-video post: mark it done anyway so we don't revisit.
                    async with map_lock:
                        entry = video_map.get(url, {})
                        entry["scraped_at"] = int(time.time())
                        video_map[url] = entry
                        save_video_map(video_map)
                else:
                    print(
                        f"[W{worker_id}] Still failing after {MAX_RETRIES} retries — will retry next run.")
                continue

            # Give late-loading players a moment before reading the DOM.
            await asyncio.sleep(1.5)
            html = await page.content()
            title = extract_title_from_html(html)
            html_videos = extract_mp4_from_html(html)
            found = set(html_videos) | set(video_hits)
            video_hits.clear()

            # Drop the player plugin's placeholder clip.
            all_videos = [m for m in found if m not in (
                f"{BASE_URL}/wp-content/plugins/easy-video-player/lib/blank.mp4",
            )]

            async with map_lock:
                new_found = found - known
                if new_found:
                    print(f"[W{worker_id}] Found {len(new_found)} new video URLs")
                    known.update(new_found)
                elif all_videos:
                    print(
                        f"[W{worker_id}] {len(all_videos)} video(s) already known — skipping write.")
                else:
                    print(f"[W{worker_id}] No video found on page.")

                entry = video_map.get(url, {})
                if title:
                    entry["title"] = title
                # Merge newly found videos with whatever the map already has.
                existing_videos = set(entry.get("videos", []))
                existing_videos.update(all_videos)
                entry["videos"] = sorted(existing_videos)
                # A video post only counts as done once it yielded a video.
                mark_done = bool(all_videos) or not _expects_video(url)
                if mark_done:
                    entry["scraped_at"] = int(time.time())
                video_map[url] = entry
                save_video_map(video_map)

            if not mark_done:
                if attempt < MAX_RETRIES:
                    retry_counts[idx] = attempt + 1
                    queue.put_nowait((idx, url))
                    print(
                        f"[W{worker_id}] Re-queued for retry ({attempt + 1}/{MAX_RETRIES}).")
                else:
                    print(
                        f"[W{worker_id}] No video after {MAX_RETRIES} retries — will retry next run.")
    finally:
        await page.close()
|
||||||
|
|
||||||
|
|
||||||
|
async def run():
    """Orchestrate the scrape: auth, URL discovery, and concurrent Playwright workers.

    Installs SIGINT/SIGTERM handlers that request a graceful stop (workers
    finish their current page), and restores default handlers on exit.
    """
    shutdown_event = asyncio.Event()
    loop = asyncio.get_running_loop()

    def _handle_shutdown(signum, _frame):
        # Signal handlers run outside the loop thread context — hand the
        # event-set over via call_soon_threadsafe.
        print(f"\n[!] Signal {signum} received — finishing active pages then exiting…")
        loop.call_soon_threadsafe(shutdown_event.set)

    signal.signal(signal.SIGINT, _handle_shutdown)
    signal.signal(signal.SIGTERM, _handle_shutdown)

    try:
        cookie_name, cookie_value = _get_login_cookie()
        req_headers = {
            **API_HEADERS,
            "Cookie": f"{cookie_name}={cookie_value}; eav-age-verified=1",
        }

        urls = load_post_urls(req_headers)

        video_map = load_video_map()
        # Backfill titles/descriptions for video posts that lack them.
        if any(u not in video_map
               or not video_map[u].get("title")
               or not video_map[u].get("description")
               for u in urls if _expects_video(u)):
            fetch_metadata_from_api(video_map, urls, req_headers)

        # Every video URL already mapped, across all posts.
        known = {u for entry in video_map.values() for u in entry.get("videos", [])}

        total = len(urls)
        pending = []
        needs_map = 0
        for i, u in enumerate(urls):
            entry = video_map.get(u, {})
            if not entry.get("scraped_at"):
                # Never scraped.
                pending.append((i, u))
            elif _expects_video(u) and not entry.get("videos"):
                # Scraped, but a video post with no videos mapped yet.
                pending.append((i, u))
                needs_map += 1

        done_count = sum(1 for v in video_map.values() if v.get("scraped_at"))
        print(f"[+] Loaded {total} post URLs.")
        print(f"[+] Already have {len(known)} video URLs mapped.")
        print(f"[+] Video map: {len(video_map)} entries in {VIDEO_MAP_FILE}")
        if done_count:
            remaining_new = len(pending) - needs_map
            print(
                f"[↻] Resuming: {done_count} done, {remaining_new} new + {needs_map} needing map data.")
        if not pending:
            print("[✓] All URLs already processed and mapped.")
            return

        print(
            f"[⚡] Running with {min(MAX_WORKERS, len(pending))} concurrent workers.\n")

        queue = asyncio.Queue()
        for item in pending:
            queue.put_nowait(item)

        map_lock = asyncio.Lock()
        retry_counts = {}

        async with async_playwright() as p:
            browser = await p.firefox.launch(headless=True)
            context = await browser.new_context()

            # Inject the WP login cookie and the age-gate bypass cookie.
            _cookie_domain = urlparse(BASE_URL).netloc
            site_cookies = [
                {
                    "name": cookie_name,
                    "value": cookie_value,
                    "domain": _cookie_domain,
                    "path": "/",
                    "httpOnly": True,
                    "secure": True,
                    "sameSite": "None"
                },
                {
                    "name": "eav-age-verified",
                    "value": "1",
                    "domain": _cookie_domain,
                    "path": "/"
                }
            ]

            await context.add_cookies(site_cookies)

            num_workers = min(MAX_WORKERS, len(pending))
            workers = [
                asyncio.create_task(
                    worker(i, queue, context, known,
                           total, retry_counts, video_map, map_lock, shutdown_event)
                )
                for i in range(num_workers)
            ]

            await asyncio.gather(*workers)
            await browser.close()

        mapped = sum(1 for v in video_map.values() if v.get("videos"))
        print(
            f"\n[+] Video map: {mapped} posts with videos, {len(video_map)} total entries.")

        if not shutdown_event.is_set():
            print(f"[✓] Completed. Full map in {VIDEO_MAP_FILE}")
        else:
            done = sum(1 for v in video_map.values() if v.get("scraped_at"))
            print(f"[⏸] Paused — {done}/{total} done. Run again to resume.")
    finally:
        # Restore default signal behaviour for any code that runs after us.
        signal.signal(signal.SIGINT, signal.SIG_DFL)
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
|
||||||
|
|
||||||
|
|
||||||
|
def main():
    """CLI entry point: run the async scraper with friendly exits."""
    try:
        asyncio.run(run())
    except KeyboardInterrupt:
        print("\n[!] Interrupted. Run again to resume.")
    except RuntimeError as e:
        # e.g. a missing/malformed WP_LOGIN_COOKIE raised by _get_login_cookie().
        raise SystemExit(f"[!] {e}")
|
||||||
|
|
||||||
|
|
||||||
|
# Script entry point.
if __name__ == "__main__":
    main()
|
||||||
21252
openapi.json
Normal file
21252
openapi.json
Normal file
File diff suppressed because one or more lines are too long
4
requirements.txt
Normal file
4
requirements.txt
Normal file
@@ -0,0 +1,4 @@
|
|||||||
|
playwright==1.58.0
|
||||||
|
python-dotenv==1.2.1
|
||||||
|
Requests==2.32.5
|
||||||
|
rookiepy==0.5.6
|
||||||
61
total_size.py
Normal file
61
total_size.py
Normal file
@@ -0,0 +1,61 @@
|
|||||||
|
"""Calculate total disk space needed to download all videos.
|
||||||
|
|
||||||
|
Importable function:
|
||||||
|
summarize_sizes(sizes) - return dict with total, smallest, largest, average, failed
|
||||||
|
"""
|
||||||
|
|
||||||
|
from check_clashes import fmt_size, fetch_sizes, load_video_map, VIDEO_MAP_FILE
|
||||||
|
|
||||||
|
|
||||||
|
def summarize_sizes(sizes):
    """Given {url: size_or_None}, return totals, extremes, average, and failures."""
    failed = [url for url, size in sizes.items() if size is None]
    measured = {url: size for url, size in sizes.items() if size is not None}

    if not measured:
        # Nothing measurable — zero stats, but still report the failures.
        return {"sized": 0, "total": len(sizes), "total_bytes": 0,
                "smallest": 0, "largest": 0, "average": 0, "failed": failed}

    byte_counts = measured.values()
    grand_total = sum(byte_counts)
    return {
        "sized": len(measured),
        "total": len(sizes),
        "total_bytes": grand_total,
        "smallest": min(byte_counts),
        "largest": max(byte_counts),
        "average": grand_total // len(measured),
        "failed": failed,
    }
|
||||||
|
|
||||||
|
|
||||||
|
# --------------- CLI ---------------
|
||||||
|
|
||||||
|
def _progress(done, total):
|
||||||
|
if done % 200 == 0 or done == total:
|
||||||
|
print(f" {done}/{total}")
|
||||||
|
|
||||||
|
|
||||||
|
def main():
    """CLI: HEAD every mapped video URL and print aggregate size statistics."""
    vm = load_video_map()
    # Flatten all mapped video URLs, keeping only absolute http(s) ones.
    urls = [u for entry in vm.values() for u in entry.get("videos", []) if u.startswith("http")]

    print(f"[+] {len(urls)} URLs in {VIDEO_MAP_FILE}")
    print("[+] Fetching file sizes (20 threads)…\n")

    sizes = fetch_sizes(urls, workers=20, on_progress=_progress)
    stats = summarize_sizes(sizes)

    print(f"\n{'=' * 45}")
    print(f"  Sized:    {stats['sized']}/{stats['total']} files")
    print(f"  Total:    {fmt_size(stats['total_bytes'])}")
    print(f"  Smallest: {fmt_size(stats['smallest'])}")
    print(f"  Largest:  {fmt_size(stats['largest'])}")
    print(f"  Average:  {fmt_size(stats['average'])}")
    print(f"{'=' * 45}")

    # List any URLs whose size could not be determined.
    if stats["failed"]:
        print(f"\n[!] {len(stats['failed'])} URL(s) could not be sized:")
        for u in stats["failed"]:
            print(f"    {u}")
|
||||||
|
|
||||||
|
|
||||||
|
# Script entry point.
if __name__ == "__main__":
    main()
|
||||||
603
upload.py
Normal file
603
upload.py
Normal file
@@ -0,0 +1,603 @@
|
|||||||
|
"""Upload videos to PeerTube with transcoding-aware flow control.
|
||||||
|
|
||||||
|
Uploads videos one batch at a time, waits for each batch to be fully transcoded
|
||||||
|
and moved to object storage before uploading the next — preventing disk
|
||||||
|
exhaustion on the PeerTube server.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python upload.py # upload from ./downloads
|
||||||
|
python upload.py -i /mnt/vol/dl # custom input dir
|
||||||
|
python upload.py --batch-size 2 # upload 2, wait, repeat
|
||||||
|
python upload.py --dry-run # preview without uploading
|
||||||
|
python upload.py --skip-wait # upload without waiting
|
||||||
|
|
||||||
|
Required (CLI flag or env var):
|
||||||
|
--url / PEERTUBE_URL
|
||||||
|
--username / PEERTUBE_USER
|
||||||
|
--channel / PEERTUBE_CHANNEL
|
||||||
|
--password / PEERTUBE_PASSWORD
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
from collections import Counter
|
||||||
|
import html
|
||||||
|
import os
|
||||||
|
from pathlib import Path
|
||||||
|
import re
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
|
||||||
|
import requests
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
|
||||||
|
from check_clashes import fmt_size, url_to_filename, VIDEO_EXTS
|
||||||
|
from download import (
|
||||||
|
load_video_map,
|
||||||
|
collect_urls,
|
||||||
|
get_paths_for_mode,
|
||||||
|
read_mode,
|
||||||
|
MODE_ORIGINAL,
|
||||||
|
DEFAULT_OUTPUT,
|
||||||
|
)
|
||||||
|
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
# ── Defaults ─────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
# Videos uploaded per batch before waiting on transcoding.
DEFAULT_BATCH_SIZE = 1
# Seconds between transcoding-status polls.
DEFAULT_POLL = 30
# Ledger file tracking already-uploaded videos.
UPLOADED_FILE = ".uploaded"
# PeerTube's maximum video-name length.
PT_NAME_MAX = 120
|
||||||
|
|
||||||
|
|
||||||
|
# ── Text helpers ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def clean_description(raw):
    """Remove WordPress shortcodes, HTML tags and entities; cap at 10 000 chars."""
    if not raw:
        return ""
    cleaned = re.sub(r'\[/?[^\]]+\]', '', raw)   # [shortcode] / [/shortcode]
    cleaned = re.sub(r'<[^>]+>', '', cleaned)    # HTML tags
    cleaned = html.unescape(cleaned)
    cleaned = re.sub(r'\n{3,}', '\n\n', cleaned).strip()
    return cleaned[:10000]
|
||||||
|
|
||||||
|
|
||||||
|
def make_pt_name(title, fallback_filename):
    """Build a PeerTube-safe video name (3-120 chars), falling back to the file stem."""
    if title:
        name = html.unescape(title).strip()
    else:
        name = Path(fallback_filename).stem

    if len(name) > PT_NAME_MAX:
        # Trim to the limit and mark the truncation with an ellipsis.
        name = name[: PT_NAME_MAX - 1].rstrip() + "\u2026"

    # Names shorter than 3 characters are rejected — pad with underscores.
    while len(name) < 3:
        name += "_"
    return name
|
||||||
|
|
||||||
|
|
||||||
|
# ── PeerTube API ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def get_oauth_token(base, username, password):
    """Authenticate against a PeerTube instance and return an OAuth access token."""
    # PeerTube publishes per-instance OAuth client credentials.
    client_resp = requests.get(f"{base}/api/v1/oauth-clients/local", timeout=15)
    client_resp.raise_for_status()
    client = client_resp.json()

    token_resp = requests.post(
        f"{base}/api/v1/users/token",
        data={
            "client_id": client["client_id"],
            "client_secret": client["client_secret"],
            "grant_type": "password",
            "username": username,
            "password": password,
        },
        timeout=15,
    )
    token_resp.raise_for_status()
    return token_resp.json()["access_token"]
|
||||||
|
|
||||||
|
|
||||||
|
def api_headers(token):
    """Bearer-auth header dict for PeerTube API calls."""
    return {"Authorization": "Bearer " + token}
|
||||||
|
|
||||||
|
|
||||||
|
def get_channel_id(base, token, channel_name):
    """Resolve a channel name to its numeric PeerTube id."""
    resp = requests.get(
        f"{base}/api/v1/video-channels/{channel_name}",
        headers=api_headers(token),
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()["id"]
|
||||||
|
|
||||||
|
|
||||||
|
def get_channel_video_names(base, token, channel_name):
    """Paginate through the channel and return a Counter of video names."""
    # Counter rather than set: the same name can legitimately appear on
    # several uploads, and callers may care how many.
    counts = Counter()
    start = 0
    while True:
        r = requests.get(
            f"{base}/api/v1/video-channels/{channel_name}/videos",
            params={"start": start, "count": 100},
            headers=api_headers(token),
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        for v in data.get("data", []):
            counts[v["name"]] += 1
        # PeerTube list endpoints report the overall total alongside each page.
        start += 100
        if start >= data.get("total", 0):
            break
    return counts
|
||||||
|
|
||||||
|
|
||||||
|
# Resumable-upload tuning: bytes per PUT and network-error retry budget.
CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB
MAX_RETRIES = 5
|
||||||
|
|
||||||
|
|
||||||
|
def _init_resumable(base, token, channel_id, filepath, filename, name,
                    description="", nsfw=False):
    """POST to create a resumable upload session. Returns upload URL."""
    file_size = Path(filepath).stat().st_size
    metadata = {
        "name": name,
        "channelId": channel_id,
        "filename": filename,
        "nsfw": nsfw,
        "waitTranscoding": True,
        # privacy 1 — presumably "public" in PeerTube's enum; confirm against
        # the PeerTube API docs before changing.
        "privacy": 1,
    }
    if description:
        metadata["description"] = description

    r = requests.post(
        f"{base}/api/v1/videos/upload-resumable",
        headers={
            **api_headers(token),
            "Content-Type": "application/json",
            # Announce total size/type of the upcoming chunked upload.
            "X-Upload-Content-Length": str(file_size),
            "X-Upload-Content-Type": "video/mp4",
        },
        json=metadata,
        timeout=30,
    )
    r.raise_for_status()

    # The session URL comes back in Location; normalise scheme-relative and
    # path-relative forms to an absolute URL.
    location = r.headers["Location"]
    if location.startswith("//"):
        location = "https:" + location
    elif location.startswith("/"):
        location = base + location
    return location, file_size
|
||||||
|
|
||||||
|
|
||||||
|
def _query_offset(upload_url, token, file_size):
    """Ask the resumable-upload endpoint how many bytes it already holds."""
    resp = requests.put(
        upload_url,
        headers={
            **api_headers(token),
            "Content-Range": f"bytes */{file_size}",
            "Content-Length": "0",
        },
        timeout=15,
    )
    if resp.status_code == 308:
        # 308 Resume Incomplete: the Range header ends at the last byte stored.
        received = resp.headers.get("Range", "")
        if received:
            return int(received.split("-")[1]) + 1
        return 0
    if resp.status_code == 200:
        # Server already has the whole file.
        return file_size
    resp.raise_for_status()
    return 0
|
||||||
|
|
||||||
|
|
||||||
|
def upload_video(base, token, channel_id, filepath, name,
                 description="", nsfw=False):
    """Resumable chunked upload. Returns (ok, uuid)."""
    filepath = Path(filepath)
    filename = filepath.name
    file_size = filepath.stat().st_size

    # Create the resumable session first; any failure here is terminal.
    try:
        upload_url, _ = _init_resumable(
            base, token, channel_id, filepath, filename,
            name, description, nsfw,
        )
    except Exception as e:
        print(f" Init failed: {e}")
        return False, None

    offset = 0       # next byte to send
    retries = 0      # consecutive failures for the current chunk

    with open(filepath, "rb") as f:
        while offset < file_size:
            # Inclusive end byte of this chunk, clamped to the file size.
            end = min(offset + CHUNK_SIZE, file_size) - 1
            chunk_len = end - offset + 1

            # Seek every iteration: offset may have been rewound/advanced
            # by a server-side resync after a failure.
            f.seek(offset)
            chunk = f.read(chunk_len)

            pct = int(100 * (end + 1) / file_size)
            print(f" {fmt_size(offset)}/{fmt_size(file_size)} ({pct}%)",
                  end="\r", flush=True)

            try:
                r = requests.put(
                    upload_url,
                    headers={
                        **api_headers(token),
                        "Content-Type": "application/octet-stream",
                        "Content-Range": f"bytes {offset}-{end}/{file_size}",
                        "Content-Length": str(chunk_len),
                    },
                    data=chunk,
                    timeout=120,
                )
            except (requests.ConnectionError, requests.Timeout) as e:
                # Transport-level failure: back off exponentially, then ask
                # the server how much it actually received before retrying.
                retries += 1
                if retries > MAX_RETRIES:
                    print(
                        f"\n Upload failed after {MAX_RETRIES} retries: {e}")
                    return False, None
                wait = min(2 ** retries, 60)
                print(f"\n Connection error, retry {retries}/{MAX_RETRIES} "
                      f"in {wait}s ...")
                time.sleep(wait)
                try:
                    offset = _query_offset(upload_url, token, file_size)
                except Exception:
                    # Resync is best-effort; fall back to resending from
                    # the current offset.
                    pass
                continue

            if r.status_code == 308:
                # Chunk accepted (upload incomplete). Trust the server's
                # Range header for the next offset when it provides one.
                range_hdr = r.headers.get("Range", "")
                if range_hdr:
                    offset = int(range_hdr.split("-")[1]) + 1
                else:
                    offset = end + 1
                retries = 0

            elif r.status_code == 200:
                # Final chunk accepted: upload is complete.
                print(
                    f" {fmt_size(file_size)}/{fmt_size(file_size)} (100%)")
                uuid = r.json().get("video", {}).get("uuid")
                return True, uuid

            elif r.status_code in (502, 503, 429):
                # Transient server-side errors: honour Retry-After, resync
                # the offset, and count against the retry budget.
                retry_after = int(r.headers.get("Retry-After", 10))
                retries += 1
                if retries > MAX_RETRIES:
                    print(
                        f"\n Upload failed: server returned {r.status_code}")
                    return False, None
                print(
                    f"\n Server {r.status_code}, retry in {retry_after}s ...")
                time.sleep(retry_after)
                try:
                    offset = _query_offset(upload_url, token, file_size)
                except Exception:
                    pass

            else:
                # Any other status is treated as a hard failure.
                detail = r.text[:300] if r.text else str(r.status_code)
                print(f"\n Upload failed ({r.status_code}): {detail}")
                return False, None

    # All bytes sent but the server never returned 200 — treat as failure.
    print("\n Unexpected: sent all bytes but no 200 response")
    return False, None
|
||||||
|
|
||||||
|
|
||||||
|
# Video state ids → human-readable labels for progress output.
# NOTE(review): presumably mirrors PeerTube's VideoState enum — confirm
# against the API docs. Ids 7 and 8 are treated as terminal failures by
# wait_for_published().
_STATE = {
    1: "Published",
    2: "To transcode",
    3: "To import",
    6: "Moving to object storage",
    7: "Transcoding failed",
    8: "Storage move failed",
    9: "To edit",
}
|
||||||
|
|
||||||
|
|
||||||
|
def get_video_state(base, token, uuid):
    """Fetch a video's processing state as (state_id, state_label)."""
    endpoint = f"{base}/api/v1/videos/{uuid}"
    resp = requests.get(endpoint, headers=api_headers(token), timeout=15)
    resp.raise_for_status()
    state_info = resp.json()["state"]
    return state_info["id"], state_info.get("label", "")
|
||||||
|
|
||||||
|
|
||||||
|
def wait_for_published(base, token, uuid, poll_interval):
    """Block until the video reaches state 1 (Published) or a failure state."""
    t0 = time.monotonic()
    while True:
        # Render the elapsed time compactly (hours/minutes shown only
        # once they are non-zero).
        total_secs = int(time.monotonic() - t0)
        hrs, remainder = divmod(total_secs, 3600)
        mins, secs = divmod(remainder, 60)
        if hrs:
            elapsed_str = f"{hrs}h {mins:02d}m {secs:02d}s"
        elif mins:
            elapsed_str = f"{mins}m {secs:02d}s"
        else:
            elapsed_str = f"{secs}s"

        # Polling errors are non-fatal: report and try again next cycle.
        try:
            sid, label = get_video_state(base, token, uuid)
        except requests.exceptions.RequestException as exc:
            print(f" -> Poll error ({exc.__class__.__name__}) "
                  f"after {elapsed_str}, retrying in {poll_interval}s …")
            time.sleep(poll_interval)
            continue

        display = _STATE.get(sid, label or f"state {sid}")

        if sid == 1:  # Published — success.
            print(f" -> {display}")
            return sid
        if sid in (7, 8):  # Terminal failure states.
            print(f" -> FAILED: {display}")
            return sid

        print(f" -> {display} … {elapsed_str} elapsed (next check in {poll_interval}s)")
        time.sleep(poll_interval)
|
||||||
|
|
||||||
|
|
||||||
|
# ── State tracker ────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def load_uploaded(input_dir):
    """Return the set of relative Paths recorded in the upload tracker."""
    tracker = Path(input_dir) / UPLOADED_FILE
    if not tracker.exists():
        return set()
    entries = tracker.read_text().splitlines()
    return {Path(entry.strip()) for entry in entries if entry.strip()}
|
||||||
|
|
||||||
|
|
||||||
|
def mark_uploaded(input_dir, rel_path):
    """Append rel_path to the upload tracker file."""
    tracker = Path(input_dir) / UPLOADED_FILE
    with open(tracker, "a") as tracker_fh:
        tracker_fh.write(f"{rel_path}\n")
|
||||||
|
|
||||||
|
|
||||||
|
# ── File / metadata helpers ─────────────────────────────────────────
|
||||||
|
|
||||||
|
def build_path_to_meta(video_map, input_dir):
    """Map each expected download path (relative) to {title, description}."""
    all_urls = collect_urls(video_map)
    mode = read_mode(input_dir) or MODE_ORIGINAL
    url_paths = get_paths_for_mode(mode, all_urls, video_map, input_dir)

    # First entry wins when the same video URL appears in several entries.
    meta_by_url = {}
    for entry in video_map.values():
        title = entry.get("title", "")
        desc = entry.get("description", "")
        for video_url in entry.get("videos", []):
            meta_by_url.setdefault(
                video_url, {"title": title, "description": desc})

    mapping = {}
    for video_url, abs_path in url_paths.items():
        rel = Path(abs_path).relative_to(input_dir)
        meta = meta_by_url.get(video_url, {"title": "", "description": ""})
        mapping[rel] = {**meta, "original_filename": url_to_filename(video_url)}
    return mapping
|
||||||
|
|
||||||
|
|
||||||
|
def find_videos(input_dir):
    """Walk input_dir and return a set of relative paths for all video files."""
    videos = set()
    for root, dirs, files in os.walk(input_dir):
        # Prune hidden directories in place so os.walk never descends.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        videos.update(
            (Path(root) / fname).relative_to(input_dir)
            for fname in files
            if Path(fname).suffix.lower() in VIDEO_EXTS
        )
    return videos
|
||||||
|
|
||||||
|
|
||||||
|
# ── Channel match helpers ─────────────────────────────────────────────
|
||||||
|
|
||||||
|
def _channel_match(rel, path_meta, existing):
    """Return (matched, name) for a local file against the channel name set.

    Checks both the title-derived name and the original-filename-derived name
    so that videos uploaded under either form are recognised. Extracted to
    avoid duplicating the logic between the pre-reconcile sweep and the per-
    file check inside the upload loop.
    """
    entry_meta = path_meta.get(rel, {})
    display_name = make_pt_name(entry_meta.get("title", ""), rel.name)
    source_fn = entry_meta.get("original_filename", "")
    fallback_name = make_pt_name("", source_fn) if source_fn else None
    # Callers only truth-test the first element, so the original
    # short-circuit expression is kept verbatim.
    matched = display_name in existing or (
        fallback_name
        and fallback_name != display_name
        and fallback_name in existing
    )
    return matched, display_name
|
||||||
|
|
||||||
|
|
||||||
|
# ── CLI ──────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def main():
    """CLI entry point: scan local videos, then upload them to PeerTube in
    transcoding-aware batches, tracking completed uploads on disk."""
    ap = argparse.ArgumentParser(
        description="Upload videos to PeerTube with transcoding-aware batching",
    )
    ap.add_argument("--input", "-i", default=DEFAULT_OUTPUT,
                    help=f"Directory with downloaded videos (default: {DEFAULT_OUTPUT})")
    ap.add_argument("--url",
                    help="PeerTube instance URL (or set PEERTUBE_URL env var)")
    ap.add_argument("--username", "-U",
                    help="PeerTube username (or set PEERTUBE_USER env var)")
    ap.add_argument("--password", "-p",
                    help="PeerTube password (or set PEERTUBE_PASSWORD env var)")
    ap.add_argument("--channel", "-C",
                    help="Channel to upload to (or set PEERTUBE_CHANNEL env var)")
    ap.add_argument("--batch-size", "-b", type=int, default=DEFAULT_BATCH_SIZE,
                    help="Videos to upload before waiting for transcoding (default: 1)")
    ap.add_argument("--poll-interval", type=int, default=DEFAULT_POLL,
                    help=f"Seconds between state polls (default: {DEFAULT_POLL})")
    ap.add_argument("--skip-wait", action="store_true",
                    help="Upload everything without waiting for transcoding")
    ap.add_argument("--nsfw", action="store_true",
                    help="Mark videos as NSFW")
    ap.add_argument("--dry-run", "-n", action="store_true",
                    help="Preview what would be uploaded")
    args = ap.parse_args()

    # CLI flags override environment variables.
    url = args.url or os.environ.get("PEERTUBE_URL")
    username = args.username or os.environ.get("PEERTUBE_USER")
    channel = args.channel or os.environ.get("PEERTUBE_CHANNEL")
    password = args.password or os.environ.get("PEERTUBE_PASSWORD")

    # Credentials are only required for a real run; dry runs need none.
    if not args.dry_run:
        missing = [label for label, val in [
            ("--url / PEERTUBE_URL", url),
            ("--username / PEERTUBE_USER", username),
            ("--channel / PEERTUBE_CHANNEL", channel),
            ("--password / PEERTUBE_PASSWORD", password),
        ] if not val]
        if missing:
            for label in missing:
                print(f"[!] Required: {label}")
            sys.exit(1)

    # ── load metadata & scan disk ──
    video_map = load_video_map()
    path_meta = build_path_to_meta(video_map, args.input)
    on_disk = find_videos(args.input)

    # Files present on disk but absent from the metadata map get empty
    # metadata so the filename is used as the title downstream.
    unmatched = on_disk - set(path_meta.keys())
    if unmatched:
        print(
            f"[!] {len(unmatched)} file(s) on disk not in video_map (will use filename as title)")
        for rel in unmatched:
            path_meta[rel] = {"title": "", "description": ""}

    uploaded = load_uploaded(args.input)
    pending = sorted(rel for rel in on_disk if rel not in uploaded)

    print(f"[+] {len(on_disk)} video files in {args.input}/")
    print(f"[+] {len(uploaded)} already uploaded")
    print(f"[+] {len(pending)} pending")
    print(f"[+] Batch size: {args.batch_size}")

    if not pending:
        print("\nAll videos already uploaded.")
        return

    # ── dry run: list what would be uploaded and the total size ──
    if args.dry_run:
        total_bytes = 0
        for rel in pending:
            meta = path_meta.get(rel, {})
            name = make_pt_name(meta.get("title", ""), rel.name)
            sz = (Path(args.input) / rel).stat().st_size
            total_bytes += sz
            print(f" [{fmt_size(sz):>10}] {name}")
        print(
            f"\n Total: {fmt_size(total_bytes)} across {len(pending)} videos")
        return

    # ── authenticate ──
    base = url.rstrip("/")
    if not base.startswith("http"):
        base = "https://" + base

    print(f"\n[+] Authenticating with {base} ...")
    token = get_oauth_token(base, username, password)
    print(f"[+] Authenticated as {username}")

    channel_id = get_channel_id(base, token, channel)
    print(f"[+] Channel: {channel} (id {channel_id})")

    # Snapshot the names already on the channel to detect duplicates and
    # to skip re-uploading files the channel already has.
    name_counts = get_channel_video_names(base, token, channel)
    existing = set(name_counts)
    total = sum(name_counts.values())
    print(f"[+] Found {total} video(s) on channel ({len(name_counts)} unique name(s))")

    dupes = {name: count for name, count in name_counts.items() if count > 1}
    if dupes:
        print(f"[!] {len(dupes)} duplicate name(s) detected on channel:")
        for name, count in sorted(dupes.items()):
            print(f" x{count} {name}")

    # ── pre-reconcile: sweep all pending against channel names ────────
    # The main upload loop discovers already-uploaded videos lazily as it
    # walks the sorted pending list — meaning on a fresh run (no .uploaded
    # file) you won't know how many files are genuinely new until the loop
    # has processed everything. Doing a full sweep here, before any
    # upload starts, gives an accurate count up-front and pre-populates
    # .uploaded so that interrupted/re-run sessions skip them instantly
    # without re-checking each time.
    pre_matched = []
    for rel in pending:
        if _channel_match(rel, path_meta, existing)[0]:
            pre_matched.append(rel)
    if pre_matched:
        print(f"\n[+] Pre-sweep: {len(pre_matched)} local file(s) already on channel — marking uploaded")
        for rel in pre_matched:
            mark_uploaded(args.input, rel)
        pending = [rel for rel in pending if rel not in set(pre_matched)]
        print(f"[+] {len(pending)} left to upload\n")

    nsfw = args.nsfw
    total_up = 0
    batch: list[tuple[str, str]] = []  # [(uuid, name), ...]

    try:
        for rel in pending:
            # ── flush batch if full: wait for transcoding before more ──
            if not args.skip_wait and len(batch) >= args.batch_size:
                print(
                    f"\n[+] Waiting for {len(batch)} video(s) to finish processing ...")
                for uuid, bname in batch:
                    print(f"\n [{bname}]")
                    wait_for_published(base, token, uuid, args.poll_interval)
                batch.clear()

            filepath = Path(args.input) / rel
            meta = path_meta.get(rel, {})
            name = make_pt_name(meta.get("title", ""), rel.name)
            desc = clean_description(meta.get("description", ""))
            sz = filepath.stat().st_size

            # Re-check against the channel: `existing` grows as uploads
            # complete during this run.
            if _channel_match(rel, path_meta, existing)[0]:
                print(f"\n[skip] already on channel: {name}")
                mark_uploaded(args.input, rel)
                continue

            print(f"\n[{total_up + 1}/{len(pending)}] {name}")
            print(f" File: {rel} ({fmt_size(sz)})")

            ok, uuid = upload_video(
                base, token, channel_id, filepath, name, desc, nsfw)
            if not ok:
                # Failed uploads are not marked; they'll retry next run.
                continue

            print(f" Uploaded uuid={uuid}")
            mark_uploaded(args.input, rel)
            total_up += 1
            existing.add(name)

            if uuid:
                batch.append((uuid, name))

        # ── wait for final batch ──
        if batch and not args.skip_wait:
            print(f"\n[+] Waiting for final {len(batch)} video(s) ...")
            for uuid, bname in batch:
                print(f"\n [{bname}]")
                wait_for_published(base, token, uuid, args.poll_interval)

    except KeyboardInterrupt:
        # 130 = conventional exit code for SIGINT.
        print(
            f"\n\n[!] Interrupted after {total_up} uploads. Re-run to continue.")
        sys.exit(130)

    print(f"\n{'=' * 50}")
    print(f" Uploaded: {total_up} video(s)")
    print(" Done!")
    print(f"{'=' * 50}")
|
||||||
|
|
||||||
|
|
||||||
|
# Script entry point.
if __name__ == "__main__":
    main()
|
||||||
11025
video_map.json
Normal file
11025
video_map.json
Normal file
File diff suppressed because one or more lines are too long
Reference in New Issue
Block a user