# Android (Capacitor)
## Requirements
- Node >= 22
- JDK 17+ (21 used in practice)
- Android SDK with platform 36 + build-tools 36.0.0
- SDK location: `/usr/lib/android-sdk`, also set in `android/local.properties`
## Config
- [`capacitor.config.ts`](../../capacitor.config.ts) — `appId: chat.vojo.app`, `webDir: dist`
- `android/` — generated Android Studio project, `targetSdkVersion 36`, `compileSdkVersion 36`, `minSdkVersion 24`
## Build scripts
```bash
npm run build:android:debug # full chain: build → sync → debug APK
npm run build:android:release # full chain: build → sync → release APK
npm run build:android:aab # full chain: build → sync → release AAB
npm run android:sync # sync dist/ → android assets
npm run android:apk:debug # gradle debug build only
```
**APK output**: `android/app/build/outputs/apk/debug/app-debug.apk`
## Versioning
`versionCode` and `versionName` are derived from `git describe --tags --match 'v*'` in [`android/app/build.gradle`](../../android/app/build.gradle), mirroring `resolveAppVersion()` in [`vite.config.js`](../../vite.config.js) so the APK's `versionName` matches `__APP_VERSION__` shown in About. Tag is `v0.2.0`; `patch` is the commit count since that tag (e.g. `v0.2.0-87-g…` → versionName `0.2.87`). When git is unavailable, falls back to `package.json` `version`.
```
versionCode = major * 1_000_000 + minor * 1_000 + patch
```
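As a rough illustration of the rule above, the following Java sketch derives both values from a `git describe` string. It is not the actual Gradle code: the class name `AppVersionSketch` and the regular expression are assumptions made for this example, and it assumes the commit count simply replaces the tag's own patch number.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: mirrors the documented rule, not the real build.gradle logic. */
final class AppVersionSketch {
    // Matches "v0.2.0" (exactly on a tag) or "v0.2.0-87-gabc1234" (87 commits after it).
    private static final Pattern DESCRIBE =
            Pattern.compile("^v(\\d+)\\.(\\d+)\\.(\\d+)(?:-(\\d+)-g[0-9a-f]+)?$");

    final int major;
    final int minor;
    final int patch;

    AppVersionSketch(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    static AppVersionSketch fromDescribe(String describe) {
        Matcher m = DESCRIBE.matcher(describe.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unexpected git describe output: " + describe);
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        // Assumption: the commit count since the tag replaces the tag's patch number.
        int patch = m.group(4) != null
                ? Integer.parseInt(m.group(4))
                : Integer.parseInt(m.group(3));
        return new AppVersionSketch(major, minor, patch);
    }

    String versionName() {
        return major + "." + minor + "." + patch;
    }

    int versionCode() {
        return major * 1_000_000 + minor * 1_000 + patch;
    }
}
```

For `v0.2.0-87-gabc1234` this yields versionName `0.2.87` and versionCode `2087`. Note that the formula leaves three decimal digits for `patch`, so a commit count of 1000 or more would spill into the `minor` range.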
## Key architecture decisions
- **Bundled build.** `dist/` is copied into the APK — not loaded remotely in a WebView.
- **Service Worker stays active.** Critical for authenticated Matrix media (MSC3916 / Matrix spec v1.11+). DO NOT disable. `resolveServiceWorkerRequests` default `true`.
- **Edge-to-edge.** `EdgeToEdge.enable()` in `MainActivity.java` + `windowLayoutInDisplayCutoutMode: shortEdges`.
- **External links.** Opened via `@capacitor/browser` plugin — see [`src/app/utils/capacitor.ts`](../../src/app/utils/capacitor.ts).
- **Safe-area coloring.** `body` background-color is bound to the folds theme variable `var(--oq6d070)` for consistent safe-area coloring.
- **Safe-area insets.** Applied on `#root` (not `body`) so the theme background extends behind the system bars.
## VSCode tasks
See [`.vscode/tasks.json`](../../.vscode/tasks.json):
- `Deploy to vojo.chat` (Ctrl+Shift+D) — web deploy
- `Deploy to Android (ADB)` (Ctrl+Shift+A) — build + `adb install`
## Push string resources (generated)
Push notification text for Android is generated from `public/locales/{en,ru}.json` (namespace `Push`) by `scripts/gen-push-strings.mjs`. The Gradle build runs this automatically via `GeneratePushStringsTask` registered in `android/app/build.gradle` through AGP `addGeneratedSourceDirectory` — output goes to `build/generated/res/push/<variant>/values{,-ru}/push_strings.xml`. No manual step needed; `./gradlew assembleDebug` handles it.
The task requires `node` in `PATH`. Terminal builds and CI inherit it from the shell. **macOS Android Studio with nvm/fnm:** the GUI app may not see nvm-managed node. Workaround: set `NODE_BIN=/path/to/node` in `android/gradle.properties` (the task reads it via `project.findProperty('NODE_BIN')`), or launch Android Studio from a shell that sources your node manager (`open -a "Android Studio"`).
## Push polling fallback (WorkManager)
Users on networks that block FCM (`mtalk.google.com:5228` — corporate, school
and government whitelist intranets, ~5% of our audience) get zero pushes from
the primary channel. To cover them we run a WorkManager periodic poll of
`/_matrix/client/v3/notifications` as a parallel best-effort delivery channel.
Always on whenever push is enabled — there's no smart-detect-and-switch (FCM
gives no client-visible delivery receipts; see
[push_unifiedpush_phase1.md §11](../plans/push_unifiedpush_phase1.md) for the
full rationale of why this is the only viable shape).
Components:
| Layer | File | Role |
|---|---|---|
| Worker | [`VojoPollWorker.java`](../../android/app/src/main/java/chat/vojo/app/VojoPollWorker.java) | Periodic fetch of `/notifications`, flattens response into Sygnal-shape `Map<String,String>`, routes message/invite → `renderMessageNotification`, RTC ring → `renderMissedCallNotification`. Skips events that are `read=true`, push-rule-suppressed (`actions` lacks `notify`), in NotificationDedup, or with `ts < watermark`. Foreground-gated: doesn't render system notifications while `MainActivity.isInForeground` (still consumes state). Saves a drain cursor when capped at `MAX_PAGES_PER_RUN`. |
| Bridge | [`PollingPlugin.java`](../../android/app/src/main/java/chat/vojo/app/PollingPlugin.java) | Capacitor plugin. JS calls `saveSession` (token + homeserver, seeds watermark on first use to skip historical backlog), `schedule(15)` (unique periodic worker), `saveRoomNames` (room-id → name cache), `cancel` (awaits WorkManager Operation completion) + `clearSession` on disable/logout. |
| Renderers | [`VojoFirebaseMessagingService.java::renderMessageNotification`, `::renderMissedCallNotification`](../../android/app/src/main/java/chat/vojo/app/VojoFirebaseMessagingService.java) | Static, Context-parameterised so the Worker can post into the same notification id space as FCM. Message path uses a **per-room** `roomId.hashCode()` slot — every new event in a room appends to a MessagingStyle conversation rather than stacking as a separate card (see [MessagingStyle pipeline](#messagingstyle-pipeline) below). Missed-call path uses per-event slots so multiple missed rings stack. After successful `nm.notify`, mark the event in NotificationDedup so the polling Worker doesn't re-surface it after the user dismisses an FCM-delivered one. |
| Dedup | [`NotificationDedup.java`](../../android/app/src/main/java/chat/vojo/app/NotificationDedup.java) | Thread-safe shared LRU set of rendered event_ids. Written by both FCM service (background renders AND foreground-skipped events) and Worker (after successful render or foreground-skip). Bounded at 500 entries to comfortably exceed a single Worker run's worst case (`MAX_PAGES_PER_RUN × PAGE_LIMIT = 250`), persisted in `vojo_poll_state` SharedPreferences. |
| JS plugin | [`src/app/plugins/polling.ts`](../../src/app/plugins/polling.ts) | `registerPlugin<PollingPluginIface>('Polling', { web: noop })`. Web has no analogue (SW already wakes for push) — fallback is a no-op. |
| Lifecycle | [`src/app/hooks/usePushNotifications.ts::usePushNotificationsLifecycle`](../../src/app/hooks/usePushNotifications.ts) | Reactive to `usePushEnabled()`. On mount with push enabled: `saveSession` + `schedule` + initial room-name dump. On `visibilitychange → visible`: re-`saveSession` (recovers a 401-cleared credentials slot without remount) + re-dump room names. On unmount or push disable: `cancel` + `clearSession`. |
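The Worker's skip rules from the table above can be sketched as a single predicate plus a bounded id set. This is a simplified illustration, not the code in `VojoPollWorker.java` or `NotificationDedup.java`: the names `PollFilterSketch`, `BoundedIdSet`, and `Entry` are hypothetical, `actions` is reduced to plain strings, persistence is omitted, and eviction here is by insertion order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, no SharedPreferences persistence. */
final class PollFilterSketch {

    /** Bounded set of event ids; the eldest inserted id is evicted past capacity. */
    static final class BoundedIdSet {
        private final Map<String, Boolean> ids;

        BoundedIdSet(final int capacity) {
            this.ids = new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                    return size() > capacity;
                }
            };
        }

        synchronized boolean contains(String eventId) {
            return ids.containsKey(eventId);
        }

        synchronized void add(String eventId) {
            ids.put(eventId, Boolean.TRUE);
        }
    }

    /** Hypothetical flattened view of one /notifications entry. */
    static final class Entry {
        final String eventId;
        final long ts;
        final boolean read;
        final List<String> actions; // assumption: string actions only, tweak objects dropped

        Entry(String eventId, long ts, boolean read, List<String> actions) {
            this.eventId = eventId;
            this.ts = ts;
            this.read = read;
            this.actions = actions;
        }
    }

    /** Returns true when the entry should be rendered as a notification. */
    static boolean shouldRender(Entry e, long watermarkTs, BoundedIdSet dedup) {
        if (e.read) return false;                       // already read on another client
        if (!e.actions.contains("notify")) return false; // suppressed by a push rule (mute)
        if (dedup.contains(e.eventId)) return false;    // already rendered by FCM or a prior poll
        if (e.ts < watermarkTs) return false;           // older than the last-seen watermark
        return true;
    }
}
```

A caller would invoke `dedup.add(e.eventId)` only after the notification was posted successfully (or skipped because the app is in the foreground), matching the write rules described for `NotificationDedup`.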
Why polling is rendered as **missed call** (not CallStyle) for ring events: the
`m.rtc.notification` lifetime is 30 seconds; polling runs at the 15-minute
floor of `PeriodicWorkRequest`. Every ring observed by the Worker is already
stale and the live call long over — rendering CallStyle with ringtone would
phantom-ring a dead call. Missed-call style preserves the "you missed a call
from X" signal without the wrong UX. Live-call delivery for whitelist users
remains a gap; closing it requires a non-FCM live channel (UnifiedPush, see
the stale plan above).
Why we do not need a refresh-token flow: Vojo's homeserver is vanilla Synapse
without MAS/OIDC (see [server-side.md](server-side.md)), so access tokens are
long-lived. A 401 from the Worker clears the credentials slot and waits for
the next foreground app launch to re-bridge — no native refresh-token logic
required. If we ever migrate to MAS, the Worker needs a refresh path.
Why our source manifest does not declare `RECEIVE_BOOT_COMPLETED`: WorkManager's
library manifest already declares the permission and the `RescheduleReceiver`,
which the manifest merger folds into the merged manifest. Reboot persistence
works end-to-end without our app re-declaring anything. Apps only need to add
the permission themselves when they listen for `BOOT_COMPLETED` for their own
purposes.
Edge cases handled:
- Token rotation (post-MAS migration): currently not bridged from JS to native
on token-rotate events. JS re-saves credentials on every lifecycle re-mount
AND on visibilitychange → visible, so user-driven re-open recovers within
seconds. After a 401 the Worker clears its credentials slot; after a 403
it leaves credentials alone and just skips the cycle (403 is most often a
transient rate-limit, not a dead token).
- First fire after install / re-login: `saveSession` seeds
`KEY_LAST_SEEN_TS` to `System.currentTimeMillis() - 60s` on first write,
so the Worker doesn't render every historical unread `/notifications`
entry as a fresh push. The 60s buffer tolerates device-clock drift ahead
of the homeserver (event `ts` is server-side); without it a fast-clock
device would silently skip fresh events as "older than watermark".
- POST_NOTIFICATIONS revoked at runtime: Worker bails early on
`NotificationManagerCompat.areNotificationsEnabled() == false`. Without
this guard `nm.notify` would throw `SecurityException` per event, leave
the LRU and watermark unadvanced, and re-walk the same backlog every 15
minutes until the user re-grants permission.
- Worker > 10 minutes (Android kill timer): bounded by `MAX_PAGES_PER_RUN=5`
× `PAGE_LIMIT=50` + 30s HTTP timeout per call. Cannot exceed ~3 minutes
in normal operation. Most polls touch only a single page because the ts
watermark short-circuits the loop.
- Large backlog (>250 events accumulated while offline): when a single fire
hits `MAX_PAGES_PER_RUN` before reaching the watermark, the Worker saves
the leftover `next_token` as `KEY_DRAIN_CURSOR` AND snapshots the head ts
of the first run as `KEY_DRAIN_TARGET_TS`. Subsequent fires resume from
that cursor instead of head; the target ts is the fast-forward
destination for the watermark when drain finally completes — without it,
the bounded LRU could evict head events and let the post-drain normal
run re-render them.
- Network unavailable: `NetworkType.CONNECTED` constraint skips the run; next
cycle retries.
- Doze: WorkManager honours maintenance windows. No catch-up — only the next
scheduled fire delivers the accumulated backlog. The Worker walks from the
head of `/notifications` and stops as soon as it reaches the watermark, so a
Doze-extended gap just produces a larger first-page walk.
- Pagination assumes newest-first ordering (Vojo runs vanilla Synapse, whose
`get_push_actions_for_user` issues `ORDER BY stream_ordering DESC`). The
Matrix spec for `/notifications` does not formally mandate this ordering, so
if Vojo ever migrates to a homeserver implementation that paginates oldest-
first (Conduit, Dendrite, …) the `ts < watermark` break would clip new
events. Revisit the Worker before any such migration.
- Already-read events (user read on another client) are skipped via the `read`
field on each `/notifications` entry; their ts still advances the watermark
so they don't get re-walked next poll.
- Muted rooms: `actions` array on each `/notifications` entry is consulted;
events without `notify` (i.e. `dont_notify` from a mute push rule) are
skipped. Without this, the mute toggle wouldn't actually mute polling-
delivered notifications even though Sygnal honours it for FCM.
- User in foreground: Worker doesn't render system notifications while
`MainActivity.isInForeground` (live timeline owns UX). State still
advances so events don't replay on the next backgrounded poll.
- FCM + polling double delivery: NotificationDedup is the single source of
truth — FCM service and Worker both write to it after successful render,
both read it before posting. Even if the user dismisses an FCM-delivered
notification before polling fires, the Worker skips it.
- UTF-8 multi-byte boundaries: `readAll` accumulates raw bytes and decodes
the full buffer once, never per-chunk; otherwise a Cyrillic character
straddling an 8 KB read boundary would become U+FFFD.
- Logout race: `initMatrix.ts::logoutClient`, `clearLocalSessionAndReload`,
and the `SessionLoggedOut` listener in `ClientRoot.tsx` all call
`polling.cancel()` + `polling.clearSession()` synchronously before
`window.location.replace`, so the Worker can't fire one more time with
the stale access_token. `cancel()` awaits the WorkManager `Operation` so
a fast disable → re-enable cycle doesn't race the `KEEP` policy. The
lifecycle effect's unmount cleanup repeats the same calls as
belt-and-suspenders.
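The watermark walk and page cap described in the list above can be sketched as follows. This is a simplified model, not the Worker's code: `PollWalkSketch`, `Page`, `PageSource`, and `Result` are hypothetical names, events are reduced to their timestamps, and the drain target ts, dedup, and read/mute filtering are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: newest-first walk that stops at the watermark or the page cap. */
final class PollWalkSketch {
    static final int MAX_PAGES_PER_RUN = 5;

    /** Hypothetical page of /notifications results, newest first. */
    static final class Page {
        final long[] eventTs;
        final String nextToken; // null when the server has no further pages

        Page(long[] eventTs, String nextToken) {
            this.eventTs = eventTs;
            this.nextToken = nextToken;
        }
    }

    /** Hypothetical fetcher; a real one would issue the HTTP request. */
    interface PageSource {
        Page fetch(String fromToken);
    }

    static final class Result {
        final List<Long> toRender = new ArrayList<>();
        String drainCursor; // non-null only when the cap was hit before the watermark
    }

    static Result walk(PageSource source, long watermarkTs, String startToken) {
        Result result = new Result();
        String token = startToken;
        for (int page = 0; page < MAX_PAGES_PER_RUN; page++) {
            Page p = source.fetch(token);
            for (long ts : p.eventTs) {
                if (ts < watermarkTs) {
                    return result; // reached already-seen history, stop walking
                }
                result.toRender.add(ts);
            }
            if (p.nextToken == null) {
                return result; // server has nothing older
            }
            token = p.nextToken;
        }
        result.drainCursor = token; // capped: the next run resumes from here
        return result;
    }
}
```

The early return on `ts < watermarkTs` is exactly the step that relies on newest-first ordering, which is why the ordering caveat above matters.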
Cleanups are invoked symmetrically across every logout path:
`useDisablePushNotifications`, `logoutClient`, `clearLocalSessionAndReload`,
the `SessionLoggedOut` listener, and the lifecycle effect's unmount all
call `polling.cancel()` + `polling.clearSession()`.
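The UTF-8 boundary case from the edge-case list can be illustrated with a minimal decode-once reader. This is a sketch of the technique rather than the Worker's actual `readAll`; the class name `ReadAllSketch` and the `chunkSize` parameter are assumptions added so the behaviour is easy to exercise with a small buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: accumulate raw bytes, decode the whole buffer once. */
final class ReadAllSketch {

    static String readAll(InputStream in, int chunkSize) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        int n;
        while ((n = in.read(chunk)) != -1) {
            // Append raw bytes only; decoding here could split a multi-byte character.
            buffer.write(chunk, 0, n);
        }
        // Single decode over the complete byte sequence.
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Decoding each chunk separately with `new String(chunk, 0, n, UTF_8)` would replace the two halves of a split Cyrillic character with U+FFFD, which is the failure the real implementation avoids.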
## MessagingStyle pipeline
Background-rendered message notifications use
`NotificationCompat.MessagingStyle` so multiple events in one room collapse
into an expandable conversation card (WhatsApp / Telegram convention)
rather than each event posting a separate banner. Notification id is
**per-room** (`roomId.hashCode()`), not per-event.
Components:
| Layer | File | Role |
|---|---|---|
| Cache | [`RoomMessageCache.java`](../../android/app/src/main/java/chat/vojo/app/RoomMessageCache.java) | Thread-safe `ConcurrentHashMap<String, ArrayDeque<Entry>>` bounded at 20 messages × 200 rooms. Snapshot is taken INSIDE `compute()` so a concurrent FCM + Worker append on the same room can't race the copy. Mutated by both `VojoFirebaseMessagingService.renderMessageNotification` (FCM service path AND Worker path through the same static helper) and `appendOutgoingMessage` (ReplyReceiver echo). |
| Channels | | `vojo_messages_dm_v1` (IMPORTANCE_HIGH) + `vojo_messages_group_v1` (IMPORTANCE_DEFAULT) under `NotificationChannelGroup("vojo_messages_v1")`. Legacy `vojo_messages` is deleted on first creation of v1. Channel split lets users mute group-room noise in OS settings without losing DM alerts. |
| Metadata snapshot | | JS bridges `{roomId: {name, isDirect, isEncrypted}}` via `polling.saveRoomNames` → `KEY_ROOM_NAMES` in `vojo_poll_state`. `loadRoomMetadata` parses tolerantly (legacy `roomId: "name"` falls back to `isDirect=true, isEncrypted=true` for safety). Re-dump triggers: mount, visibility-change, `ClientEvent.AccountData` for `m.direct`, `RoomEvent.Timeline` filtered to `m.room.encryption`. |
| Process-kill recovery | | On cache miss, `seedCacheFromActiveNotification` calls `NotificationCompat.MessagingStyle.extractMessagingStyleFromNotification` on the on-shade `StatusBarNotification` to rebuild prior history. Survives process kill; fails gracefully to single-message conversation if the notification was also dismissed. |
| Receivers | [`MarkAsReadReceiver`](../../android/app/src/main/java/chat/vojo/app/MarkAsReadReceiver.java) (POST `/_matrix/client/v3/rooms/{roomId}/receipt/m.read/{eventId}` + dismiss), [`NotificationDismissReceiver`](../../android/app/src/main/java/chat/vojo/app/NotificationDismissReceiver.java) (swipe → clear cache so the next push starts fresh), [`ReplyReceiver`](../../android/app/src/main/java/chat/vojo/app/ReplyReceiver.java) (RemoteInput → PUT `m.room.message` with `m.text` body + optimistic local echo). All read credentials from `vojo_poll_state` SharedPreferences (same lifecycle as `VojoPollWorker`). |
| Receipt-driven dismiss | | JS `mx.on(RoomEvent.Receipt)` filters own-user receipts, checks `room.getUnreadNotificationCount(Total) === 0`, calls `polling.dismissRoom(roomId)` → native `nm.cancel + RoomMessageCache.clear`. Mirrors element-web's `Notifier.onRoomReceipt`. Killed-process dismiss is not covered (no JS context to observe the receipt), which is acceptable: the next FCM push to that room renders a fresh conversation from cache-empty state. |
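The cache row above describes taking the snapshot inside `compute()`. A minimal sketch of that pattern follows, assuming plain `String` messages instead of the real `Entry` type and omitting the 200-room bound; `RoomCacheSketch` and its method names are hypothetical and do not reflect the actual `RoomMessageCache` API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: per-room bounded history with an atomic append + snapshot. */
final class RoomCacheSketch {
    static final int MAX_MESSAGES_PER_ROOM = 20;

    private final ConcurrentHashMap<String, ArrayDeque<String>> rooms =
            new ConcurrentHashMap<>();

    /** Appends a message and returns a copy of the room history taken atomically. */
    List<String> appendAndSnapshot(String roomId, String message) {
        final List<String> snapshot = new ArrayList<>();
        rooms.compute(roomId, (id, existing) -> {
            ArrayDeque<String> deque = existing;
            if (deque == null) {
                deque = new ArrayDeque<>();
            }
            deque.addLast(message);
            while (deque.size() > MAX_MESSAGES_PER_ROOM) {
                deque.removeFirst(); // drop the oldest message
            }
            // Copy inside compute(): no other append on this key can interleave here.
            snapshot.addAll(deque);
            return deque;
        });
        return snapshot;
    }

    /** Clears a room's history, e.g. after the notification is swiped away. */
    void clear(String roomId) {
        rooms.remove(roomId);
    }

    /** Per-room notification slot, as described above. */
    static int notificationId(String roomId) {
        return roomId.hashCode();
    }
}
```

Because the copy happens while `ConcurrentHashMap` holds the per-key lock, the FCM path and the Worker path can append to the same room concurrently without either one iterating a deque that the other is mutating.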
Why MessagingStyle vs the old per-event flow: 5 messages in one DM previously
produced 5 separate cards in the shade with redundant title/avatar. The
MessagingStyle conversation matches WhatsApp/Telegram UX and is the documented
Android pattern for messaging apps. See element-android's
`RoomGroupMessageCreator` for the canonical reference.
Why two channels (DM + group) and not per-conversation channels (the
fluffychat approach): per-conversation works for low-room-count clients but
proliferates user-visible settings entries on a Matrix client with dozens of
active rooms. Element-android sidesteps the question with a NOISY/SILENT
split based on push rules; we picked a middle ground — bucketed by DM vs
group room — which mirrors fluffychat's `directChats`/`groupChats`
NotificationChannelGroup setup.
Why reply action is gated on `!isEncrypted`: the Java path has no key
material to sign + encrypt outgoing replies with, so an inline reply in an
E2EE room would send cleartext (Synapse does not enforce the
"encrypted-only" rule, so the leak is real). The snapshot defaults to
`isEncrypted=true` on cache miss and the JS side re-dumps on
`m.room.encryption` state events so the action is dropped within seconds of
a room being switched to E2EE.
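The fail-closed default described above can be sketched as a lookup that treats unknown rooms as encrypted. The names `ReplyGateSketch`, `RoomMeta`, and `allowInlineReply` are hypothetical, and using `isDirect=true` for unknown rooms is an assumption borrowed from the legacy-format fallback.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: inline reply is offered solely for rooms known to be unencrypted. */
final class ReplyGateSketch {

    static final class RoomMeta {
        final String name;
        final boolean isDirect;
        final boolean isEncrypted;

        RoomMeta(String name, boolean isDirect, boolean isEncrypted) {
            this.name = name;
            this.isDirect = isDirect;
            this.isEncrypted = isEncrypted;
        }
    }

    // Conservative default for rooms missing from the snapshot (assumption for isDirect).
    static final RoomMeta UNKNOWN = new RoomMeta(null, true, true);

    private final Map<String, RoomMeta> snapshot = new HashMap<>();

    void put(String roomId, RoomMeta meta) {
        snapshot.put(roomId, meta);
    }

    RoomMeta lookup(String roomId) {
        RoomMeta meta = snapshot.get(roomId);
        return meta != null ? meta : UNKNOWN;
    }

    /** True only when the room is positively known to be unencrypted. */
    boolean allowInlineReply(String roomId) {
        return !lookup(roomId).isEncrypted;
    }
}
```

Defaulting to encrypted means a stale or missing snapshot can only hide the reply action, never expose a cleartext send path in an E2EE room.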
Why call-session composite dedup
(`compositeCallDedupKey(roomId, sessionId)`): the legacy per-eventId dedup
misses re-rings of the same call session because each ring is a fresh
`m.rtc.notification` event with a new event_id. We extract the parent call
event_id from `content.m.relates_to.event_id` (Worker JSON parse) /
`content_m.relates_to_event_id` (FCM Sygnal-flatten) and mark the composite
in NotificationDedup the moment we post the first CallStyle. Subsequent
ring events for the same session see the mark and skip silently. Mirrors
element-web's `getIncomingCallToastKey` pattern.
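A minimal sketch of the composite dedup idea, under stated assumptions: the key format below is invented for illustration (the real `compositeCallDedupKey` may build its key differently), and an in-memory `HashSet` stands in for `NotificationDedup`.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: one ring per call session, regardless of how many ring events arrive. */
final class CallDedupSketch {
    private final Set<String> seen = new HashSet<>();

    // Hypothetical key format; only the (roomId, sessionId) pairing comes from the text above.
    static String compositeCallDedupKey(String roomId, String sessionId) {
        return "call:" + roomId + "|" + sessionId;
    }

    /**
     * Returns true for the first ring of a call session and false for re-rings.
     * parentCallEventId is the id taken from the ring event's m.relates_to reference.
     */
    synchronized boolean shouldRing(String roomId, String parentCallEventId) {
        return seen.add(compositeCallDedupKey(roomId, parentCallEventId));
    }
}
```

Keying on the parent call event rather than on each ring's own event_id is what lets a second `m.rtc.notification` for the same session be skipped silently.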
Why edit-collapse (`m.replace`) is **NOT implemented**: requires parsing
`content.m.relates_to.rel_type == "m.replace"` + finding the original event
in the per-room cache and replacing in place. The complication: FCM
payloads (Sygnal-flattened) encode nested keys inconsistently across
deployments (`content_m.relates_to_rel_type` vs
`content_m_relates_to_rel_type` vs dot-preserved variants), and the Worker
parses raw JSON cleanly while FCM hits one of the flattened shapes.
Asymmetric handling (Worker only) creates user-visible drift between
delivery paths. Real-world impact is low — users rarely edit
notification-flagged messages in the seconds-long window before they're
read — so the feature is deferred until we have a uniform key shape from
Sygnal config or until a real-world report justifies the parser complexity.
## Wireless debugging (ADB)
1. On the phone, enable Wireless debugging, tap "Pair device with pairing code", and note the IP, port, and 6-digit code.
2. `adb pair <ip>:<pair-port> <code>`
3. `adb connect <ip>:<connect-port>`
The pair port and the connect port are different — don't mix them up.