# Contributing to oreilly-epub

First off, thank you for your interest in contributing! \
This document explains how to set up your environment, run checks/tests, make
changes, and submit pull requests (PRs).

> **License note:** By contributing, you agree that your contributions will be
> licensed under the project's license (GPLv3). See `LICENSE.txt` in the repo.

## Project goals

This tool downloads an O'Reilly (Safari) book by ID and assembles a valid EPUB
by fetching metadata/chapters/assets, cleaning up XHTML, injecting styles, and
writing a standards-compliant archive. The pipeline includes:

- Authenticated HTTP client using a user-provided `cookies.json`.
- Parallel file downloads with configurable concurrency (default 4, max 8).
- XHTML/OPF processing to ensure EPUB validity (e.g., void tags, stylesheet
links, `dc:description` injection).
- CI/CD pipelines that run `rustfmt`, `clippy`, tests, and build release
artifacts on tag pushes.
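
The concurrency limits mentioned in the list above (default 4, max 8) can be pictured as a simple clamp. This is only a minimal sketch: the names `DEFAULT_PARALLEL`, `MAX_PARALLEL`, and `effective_parallelism` are hypothetical, and the real code may reject out-of-range values rather than clamp them.

```rust
/// Hypothetical constants mirroring the documented limits.
const DEFAULT_PARALLEL: usize = 4;
const MAX_PARALLEL: usize = 8;

/// Returns the number of concurrent downloads to use.
/// `requested` stands in for an optional `--parallel` value.
/// Assumption: out-of-range values are clamped, not rejected.
fn effective_parallelism(requested: Option<usize>) -> usize {
    requested
        .unwrap_or(DEFAULT_PARALLEL)
        .clamp(1, MAX_PARALLEL)
}
```

Under this assumption, a request above 8 would be reduced to 8, and an omitted value would fall back to 4.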

## Getting started

### Prerequisites

- **Rust toolchain** (stable) with `cargo`. The CI uses
`dtolnay/rust-toolchain@stable`, so match stable locally.
- For actual end-to-end runs: a **`cookies.json`** file (flat key-value JSON of
cookies for `https://learning.oreilly.com`). The app searches these locations
in order:
  1. the path given with `--cookies <path>`
  2. `${XDG_CONFIG_HOME}/oreilly-epub/cookies.json`
  3. the current working directory (`./cookies.json`)
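
The lookup order above can be sketched as follows. This is an illustration under assumptions, not the project's actual code: `cookie_candidates` and `find_cookies` are hypothetical names, and the sketch does not cover what happens when `XDG_CONFIG_HOME` is unset or when an explicit `--cookies` path does not exist.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Lists candidate locations in the documented search order.
/// `cli_path` stands in for the value of `--cookies <path>`.
fn cookie_candidates(
    cli_path: Option<&Path>,
    xdg_config_home: Option<&Path>,
) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    // 1) Explicit path from the command line, if given.
    if let Some(p) = cli_path {
        candidates.push(p.to_path_buf());
    }
    // 2) Per-user config directory, if XDG_CONFIG_HOME is known.
    if let Some(dir) = xdg_config_home {
        candidates.push(dir.join("oreilly-epub").join("cookies.json"));
    }
    // 3) Current working directory.
    candidates.push(PathBuf::from("cookies.json"));
    candidates
}

/// Returns the first candidate that exists as a file, if any.
fn find_cookies(cli_path: Option<&Path>) -> Option<PathBuf> {
    let xdg = env::var_os("XDG_CONFIG_HOME").map(PathBuf::from);
    cookie_candidates(cli_path, xdg.as_deref())
        .into_iter()
        .find(|p| p.is_file())
}
```

Keeping the candidate list separate from the filesystem check makes the ordering easy to unit test without touching the disk.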

> **Note:** The app is asynchronous (Tokio) and uses `reqwest`. No special
> services are required to build and run locally.

### Repository layout (high-level)

- `src/main.rs` – CLI, argument parsing, orchestration (fetch metadata, pages,
downloads, build EPUB).
- `src/http_client.rs` – authenticated client from `cookies.json`.
- `src/epub.rs` – download pipeline and zip/EPUB creation (mimetype first,
`META-INF/container.xml`, etc.).
- `src/xml.rs` – XHTML/OPF rewriting (void tags, attribute rewrite, stylesheet
linking, description injection).
- `src/models.rs` – data models for APIs (EPUB, chapters, files, pagination).
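
To make the "authenticated client from `cookies.json`" idea concrete, here is a sketch of how flat key-value cookie pairs could be joined into a single `Cookie` header value. The function name `cookie_header` is hypothetical, and the actual `src/http_client.rs` may take a different approach (for example, a cookie jar), so treat this as an illustration only.

```rust
use std::collections::BTreeMap;

/// Joins flat name/value pairs into a `Cookie` header value,
/// e.g. "a=1; b=2". A BTreeMap keeps the output order stable.
/// Assumption: values are already valid cookie values and need
/// no escaping.
fn cookie_header(cookies: &BTreeMap<String, String>) -> String {
    cookies
        .iter()
        .map(|(name, value)| format!("{name}={value}"))
        .collect::<Vec<_>>()
        .join("; ")
}
```

A deterministic ordering is not required by HTTP, but it makes the result straightforward to assert on in tests.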

### Quick start

```bash
# 1) Clone
git clone <REPO_URL>
cd oreilly-epub

# 2) (Optional) Place cookies.json (see above)
# Example:
# cp /path/to/cookies.json .

# 3) Build
cargo build

# 4) Format and lint locally
cargo fmt --all
cargo clippy --all-targets --all-features -- -D warnings

# 5) Run unit tests
cargo test --all

# 6) Try it (example book id)
cargo run -- 9781492097039 --parallel 4
# Add --skip-download if you already have cached files
# Add --cookies ./cookies.json to point to a specific file
```

## Development workflow

### Style and linting

- **Formatting:** `cargo fmt --all` (CI runs `cargo fmt --all -- --check`).
- **Linting:** `cargo clippy --all-targets --all-features -- -D warnings` (CI
treats warnings as errors).

> Run both the format and clippy checks locally before you push; CI fails PRs
> that do not pass them.

## Before you open a PR

### Commit messages

- Use clear, imperative subject lines: “Fix EPUB container.xml path
resolution”, not “fixed stuff”.
- Wrap body lines at \~72 characters when practical.
- Reference issues with `Fixes #123` / `Closes #123` when appropriate.

### Branch naming

Use short, descriptive branches, e.g.:

- `feat/parallelism-8-limit`
- `fix/xhtml-void-tags`
- `docs/contributing`

### Checklists

Before pushing:

- [ ] `cargo fmt --all` produces no diffs.
- [ ] `cargo clippy --all-targets --all-features -- -D warnings` passes locally.
- [ ] `cargo test --all` is green.
- [ ] If you touched XML/OPF/XHTML handling, tested at least one real book
locally (if possible).
- [ ] Updated documentation, comments, or usage text when applicable.