Post

Icon Progress Update

Krill Codebase Quality Review Report - Progress Update

Progress Update

Krill Codebase Quality Review Report - Progress Update

Review Date: December 01, 2025
Previous Review: November 30, 2025
Scope: /server, /krill-sdk, /shared/commonMain, /composeApp/desktopMain
Platform-specific modules excluded: iOS, Android, WASM
Last Commit: 389dd6c (2025-12-01 23:26:43 UTC)
Kotlin Source Files Analyzed: 154 files


Executive Summary

This report provides an updated analysis of the Krill platform codebase, comparing the current state (Dec 1, 2025) against the previous review (Nov 30, 2025). The analysis reveals minimal progress on critical issues identified in the previous review, with most high-severity issues remaining unresolved.

Progress Status: 🔴 Limited Improvement

Overall Code Quality Score: 68/100 (unchanged)


Overall Code Quality Comparison

CategoryPrevious ScoreCurrent ScoreChangeStatus
Architecture70/10070/100No change🟡
Coroutine Safety55/10058/100+3🟡
Thread Safety50/10050/100No change🔴
Error Handling60/10060/100No change🟡
Code Organization75/10075/100No change🟢
Documentation40/10040/100No change🔴
Completeness70/10070/100No change🟡

Note: The slight improvement in Coroutine Safety (+3 points) is due to desktop entry point improvements and BeaconService cleanup additions, but critical issues remain unresolved.


Progress on Critical Issues from Previous Review

🔴 UNRESOLVED: Thread-Unsafe Mutable Collections

Status: All instances from previous report remain unchanged.

1
2
3
4
5
6
7
8
9
10
11
12
// Server.kt - STILL PRESENT (removed from Server.kt but issue remains in MQTTBroker)
// MQTTBroker.kt:24 - STILL PRESENT
val jobs = mutableMapOf<String, Job>()

// NodeManager.kt:35 - STILL PRESENT
private val nodes: MutableMap<String, NodeFlow> = mutableMapOf()

// NodeEventBus.kt:9 - STILL PRESENT
val subscribers = mutableListOf<NodeEvent>()

// SilentTriggerManager.kt:9 - NEW ISSUE IDENTIFIED
val jobs = mutableMapOf<String, Job>()

Impact: Race conditions and potential crashes when multiple coroutines access these collections simultaneously.

Recommendation: Use ConcurrentHashMap for maps or wrap all access with Mutex.

🟡 PARTIALLY IMPROVED: Orphaned CoroutineScopes

Status: 2 of 6+ identified issues have been addressed.

✅ RESOLVED:

  1. PeerConnector.kt:15 - Now receives scope as parameter instead of creating its own
  2. main.kt (Desktop) - Now properly manages scope lifecycle with cancellation on shutdown

🔴 STILL PRESENT:

1
2
3
4
5
6
7
8
9
10
11
// ClientScreen.kt:80 - NEW LOCATION IDENTIFIED
CoroutineScope(Dispatchers.Default).launch { ... }

// NodeMenuClickCommandHandler.kt:119,133 - NEW LOCATIONS IDENTIFIED  
CoroutineScope(Dispatchers.Default).launch { ... }

// TriggerDialog.kt:50 - NEW LOCATION IDENTIFIED
CoroutineScope(Dispatchers.Default).launch { ... }

// MQTTBroker.kt:92 - STILL PRESENT
jobs[node.id] = CoroutineScope(Dispatchers.Default).launch { ... }

Impact: Memory leaks in UI components and server operations. Each new scope creates a new job that never gets cancelled.

🔴 UNRESOLVED: Security Vulnerabilities

Status: Critical security issues remain unchanged.

  1. CORS anyHost() - STILL PRESENT
    1
    2
    3
    4
    
    // Server.kt:40 - UNCHANGED
    install(CORS) {
        anyHost()
    }
    

    Impact: Any origin can access the API. Production deployment risk.

  2. Hardcoded Certificate Passwords - STILL PRESENT
    1
    2
    3
    4
    5
    6
    7
    
    // KtorConfig.kt:19,28-29 - UNCHANGED
    password = "changeit".toCharArray()
    keyStorePassword = { "changeit".toCharArray() }
    privateKeyPassword = { "changeit".toCharArray() }
       
    // MQTTBroker.kt:29 - UNCHANGED
    keyStorePassword = "changeit"
    

    Impact: Severe security vulnerability. Certificates can be compromised.

🟡 PARTIALLY IMPROVED: Potential Memory Leaks

Status: Some improvements in lifecycle management, but core issues remain.

✅ IMPROVED:

  • BeaconService.kt:22-25 - Now has a stop() method that cancels jobs
  • Desktop app now properly cancels scope on shutdown

🔴 STILL PRESENT:

  1. NodeManager observer pattern - Race conditions in job management still present
  2. ServerNodeProcess - lateinit var job without guaranteed cleanup
  3. NodeEventBus - Subscribers never cleaned up (no unsubscribe mechanism beyond ApplicationStopped)

🔴 UNRESOLVED: Global State and Singletons

Status: No change. All singleton objects with mutable state remain.

  • NodeEventBus (mutable subscribers list)
  • ComposeCore (mutable state flows)
  • ScreenCore (mutable state flows)
  • MQTTBroker (mutable jobs map)
  • serverScope (global coroutine scope)

Impact: Testing difficulties, potential race conditions, unclear ownership.


Entry Point Analysis Update

1. Server Entry Point: Application.kt

Changes Detected: None significant

Issues Status:

SeverityIssueStatus
🔴 HIGHrunBlocking blocks startupUNCHANGED
🟡 MEDIUMGlobal mutable isServer flagUNCHANGED
🟡 MEDIUMRedundant repeat(headerPins.size) loopUNCHANGED

2. Desktop Entry Point: main.kt

Changes Detected: ✅ Improvements made

Before (Nov 30):

1
// No scope management

After (Dec 1):

1
2
3
4
5
6
val scope = CoroutineScope(SupervisorJob() + Dispatchers.IO)
// ... 
App(scope, demo) {
    scope.cancel("App is shutting down")
    this@application.exitApplication()
}

Improvement: Desktop app now properly creates and cancels its coroutine scope, preventing resource leaks on shutdown.

Remaining Issue:

  • deleteReadyFile() still called before window creation (could leave stale file on crash)

3. Server Module: Server.kt

Changes Detected: None

All issues from previous report remain:

  • Global serverScope never properly cancelled on error scenarios
  • CORS anyHost() security risk
  • NodeEventBus subscriptions cleanup incomplete

New Issues Identified

🔴 NEW: Unmanaged Scopes in UI Components

Four new locations of unmanaged CoroutineScope(Dispatchers.Default).launch found in UI code:

  1. ClientScreen.kt:80 - Creates orphaned scope for pairing operations
  2. NodeMenuClickCommandHandler.kt:119 - Creates orphaned scope for node deletion
  3. NodeMenuClickCommandHandler.kt:133 - Creates orphaned scope for child updates
  4. TriggerDialog.kt:50 - Creates orphaned scope for trigger updates

Recommendation: Use rememberCoroutineScope() in Compose code or pass parent scope.

🟡 NEW: SilentTriggerManager Incomplete Implementation

Location: server/src/main/kotlin/krill/zone/data/SilentTriggerManager.kt

1
2
3
4
5
6
7
8
9
10
suspend fun waitJob(trigger: Node) {
    val meta = trigger.meta as TriggerMetaData
    delay(meta.value.toLong())
    Logger.i("dataengine: waiting job for ${trigger.id} exited")
    //TODO - Line 23
}

override suspend fun post(snapshot: Snapshot): Boolean {
    TODO("Not yet implemented") - Line 28
}

Impact: Silent alarm feature is non-functional. Calling post() will throw NotImplementedError.


Coroutine Scope Hierarchy - Updated

graph TB
    subgraph "Server Scopes"
        SS[serverScope<br/>Global, SupervisorJob+Default]
        SS --> SS1["MQTTBroker jobs<br/>❌ Thread-unsafe map"]
        SS --> SS2["ServerSocketManager broadcasts"]
        SS --> SS3["Node observation jobs"]
        SS --> SS4["BeaconService<br/>✅ Now has stop()"]
        style SS fill:#90EE90
        style SS1 fill:#FFB6C1
        style SS4 fill:#FFFACD
    end
    
    subgraph "Client Scopes - Desktop"
        DM[Desktop main scope<br/>✅ NEW: Properly cancelled]
        DM --> DM1["App composable"]
        style DM fill:#90EE90
    end
    
    subgraph "Client Scopes - App"
        CS[ComposeCore.scope<br/>Global, Default+SupervisorJob]
        CS --> CS1["Node interactions"]
        CS --> CS2["checkForInteraction"]
        style CS fill:#90EE90
    end
    
    subgraph "Unmanaged Scopes - UI" 
        US1["CoroutineScope(Default)<br/>ClientScreen.kt:80<br/>❌ NEW"]
        US2["CoroutineScope(Default)<br/>NodeMenuClickCommandHandler:119,133<br/>❌ NEW"]
        US3["CoroutineScope(Default)<br/>TriggerDialog.kt:50<br/>❌ NEW"]
        US4["CoroutineScope(Default)<br/>MQTTBroker.kt:92<br/>❌ UNCHANGED"]
        style US1 fill:#FFB6C1
        style US2 fill:#FFB6C1
        style US3 fill:#FFB6C1
        style US4 fill:#FFB6C1
    end
    
    subgraph "NodeManager Scope"
        NMS[NodeManager.scope<br/>Instance, Default+SupervisorJob]
        NMS --> NMS1["update posts"]
        NMS --> NMS2["delete operations"]
        NMS --> NMS3["observe flows"]
        style NMS fill:#FFFF99
    end
    
    subgraph "ScreenCore Scope"
        SCS[screenCore.scope<br/>Global, SupervisorJob only]
        style SCS fill:#FFFF99
    end

Legend:

  • 🟢 Green (#90EE90): Properly managed - [GOOD]
  • 🟡 Yellow (#FFFF99 or #FFFACD): Needs review or partial fix - [REVIEW]
  • 🔴 Pink (#FFB6C1): Unmanaged/leaking - [CRITICAL]

TODO Items Analysis

Total TODOs Found: 12

Critical TODOs (Blocking Functionality)

PriorityLocationIssueStatus
🔴 HIGHSilentTriggerManager.kt:23,28post() not implemented, throws errorUNCHANGED
🟡 MEDDataStore.kt:33Trigger checking not implementedUNCHANGED
🟡 MEDTriggerEventProcessor.kt:31Needs node refresh for latest snapshotUNCHANGED

Configuration TODOs

PriorityLocationIssueStatus
🟡 MEDNodeMetaData.ktPort 8442 hardcodedUNCHANGED
🟡 MEDMQTTBroker.kt:20Send smaller DTO instead of full NodeUNCHANGED

Documentation TODOs

PriorityLocationIssueStatus
🟢 LOWHardwareDiscovery.kt:13Document i2cdetect setupUNCHANGED
🟢 LOWHardwareDiscovery.kt:101HAT metadata parsing incompleteUNCHANGED

Detailed Progress Breakdown

What Improved ✅

  1. Desktop Lifecycle Management - Desktop app now has proper scope cancellation
  2. BeaconService Cleanup - Added stop() method for job cleanup
  3. PeerConnector Scope Management - Now receives scope instead of creating its own

What Stayed the Same 🟡

  1. Architecture and code organization
  2. Error handling patterns
  3. Documentation level
  4. Feature completeness

What Got Worse ❌

  1. New unmanaged scopes discovered in UI components (4 locations)
  2. New thread-unsafe collection found in SilentTriggerManager
  3. More clarity on incomplete implementations (SilentTriggerManager.post())

Updated Recommendations

Immediate Action Items (Week 1-2)

  1. Fix Thread Safety - CRITICAL
    1
    2
    3
    4
    5
    6
    7
    
    // Replace all instances:
    // OLD: val jobs = mutableMapOf<String, Job>()
    // NEW: val jobs = ConcurrentHashMap<String, Job>()
       
    // Or wrap with Mutex:
    private val jobsLock = Mutex()
    private val jobs = mutableMapOf<String, Job>()
    
  2. Fix Security Vulnerabilities - CRITICAL
    1
    2
    3
    4
    5
    6
    7
    8
    
    // Server.kt - Replace anyHost()
    install(CORS) {
        allowHost("localhost:*")
        allowHost(System.getenv("ALLOWED_ORIGINS") ?: "")
    }
       
    // All files - Replace hardcoded passwords
    keyStorePassword = System.getenv("KEYSTORE_PASSWORD") ?: error("KEYSTORE_PASSWORD not set")
    
  3. Fix UI Scope Leaks - HIGH
    1
    2
    3
    4
    5
    
    // In Composables, use:
    val scope = rememberCoroutineScope()
    scope.launch { ... }
       
    // Or pass parent scope as parameter
    

Short Term (Week 3-4)

  1. Implement SilentTriggerManager.post() - Complete TODO implementation
  2. Add NodeEventBus.unsubscribe() - Allow proper cleanup
  3. Fix runBlocking in Application.kt - Use proper coroutine startup

Medium Term (Month 2)

  1. Refactor NodeManager - Break down god object into smaller components
  2. Add Unit Tests - Focus on NodeManager, triggers, and data store
  3. Configuration Management - Externalize all hardcoded values

Long Term (Month 3+)

  1. Security Audit - Comprehensive review before production
  2. Performance Testing - Load test MQTT and WebSocket
  3. Documentation - API docs, architecture guides, deployment guides

Feature Implementation Gap Analysis

Feature Files Reviewed

  • 36 JSON feature definitions in /content/feature/
  • All major features have specifications

Implementation Status

Feature CategorySpec StatusImplementationGap
DataPoint Types✅ Complete✅ ImplementedNone
Triggers✅ Complete🟡 PartialSilentAlarmMs incomplete
Rule Engine✅ Complete✅ ImplementedMinor TODOs
Serial Devices✅ Complete🟡 PartialHAT meta missing
Server/Client✅ Complete✅ ImplementedNone
Calculations✅ Complete✅ ImplementedNone

Notable Gap: Silent alarm trigger has spec but implementation throws NotImplementedError.


Code Quality Metrics

Complexity Analysis

1
2
3
4
5
6
Total Kotlin Files (in scope): 154
Files with Critical Issues: 8
Files with TODOs: 12
Global Mutable Singletons: 5
Unmanaged CoroutineScopes: 7 (up from 3)
Thread-unsafe Collections: 4 (same as before)

Technical Debt Score

Previous: 68/100
Current: 68/100
Trend: → Stable (minimal change)


Comparison: Previous vs Current State

Positive Changes

AreaChangeImpact
Desktop EntryAdded scope lifecycle managementPrevents resource leaks
BeaconServiceAdded stop() methodBetter cleanup capability
PeerConnectorNow accepts scope parameterBetter structured concurrency

Negative/Neutral Changes

AreaChangeImpact
UI ComponentsNew unmanaged scopes foundIncreased leak potential
Thread SafetyNo improvementsStill high risk
SecurityNo improvementsStill critical risk
DocumentationNo improvementsStill poor

Issues Resolved: 2

Issues Introduced/Discovered: 5

Net Change: -3 (slightly worse)


Production Readiness Assessment

Before Deployment, Must Fix:

  • 🔴 All hardcoded passwords (5 instances)
  • 🔴 CORS anyHost() security issue
  • 🔴 Thread-unsafe mutable collections (4 instances)
  • 🔴 Unmanaged coroutine scopes (7 instances)
  • 🔴 SilentTriggerManager incomplete implementation

Should Fix:

  • 🟡 runBlocking in main server startup
  • 🟡 NodeEventBus subscriber cleanup
  • 🟡 Global mutable state in singletons
  • 🟡 Hardcoded file paths and ports
  • 🟡 Missing unit tests

Nice to Have:

  • 🟢 Better error handling
  • 🟢 Code documentation
  • 🟢 HAT metadata parsing
  • 🟢 Refactor NodeManager god object

Current Production Readiness:NOT READY
Estimated Time to Production Ready: 4-6 weeks (if addressed systematically)


Conclusion

The Krill codebase shows minimal improvement since the previous review (Nov 30, 2025). While there are some positive changes in scope lifecycle management for the desktop client and beacon service, the critical security and thread safety issues remain unresolved.

Key Takeaways:

  1. Security is still a blocker - Hardcoded passwords and CORS anyHost() must be fixed
  2. Thread safety needs urgent attention - Race conditions are production risks
  3. Memory leak potential remains high - Unmanaged scopes actually increased
  4. Code quality is stable but not improving - No significant architectural changes
  5. Small wins - Desktop app lifecycle and BeaconService improvements show progress is possible
  1. Immediate: Create issues for all 🔴 HIGH severity items
  2. Week 1: Fix security vulnerabilities (passwords, CORS)
  3. Week 2: Fix thread safety issues (use ConcurrentHashMap or Mutex)
  4. Week 3: Fix coroutine scope leaks in UI
  5. Week 4: Add unit tests for critical paths
  6. Month 2: Address architectural concerns

The platform has solid foundations and good architectural concepts, but needs focused effort on production hardening before deployment.


Updated Review Prompt for Future Analysis

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
Review the Kotlin code in the Krill platform focusing on:

**Scope:** 
- `/server/src/main/kotlin/krill/zone/` - Ktor server implementation
- `/krill-sdk/src/commonMain/kotlin/krill/zone/` - Core SDK shared code
- `/krill-sdk/src/jvmMain/kotlin/krill/zone/` - JVM-specific implementations
- `/shared/src/commonMain/kotlin/krill/zone/` - Shared multiplatform code
- `/composeApp/src/commonMain/kotlin/krill/zone/` - Compose UI code
- `/composeApp/src/desktopMain/kotlin/krill/zone/` - Desktop-specific code

**Exclude:** iOS, Android, WASM platform-specific modules

**Entry Points:**
1. Server: `server/src/main/kotlin/krill/zone/Application.kt` (Ktor Netty server)
2. Desktop: `composeApp/src/desktopMain/kotlin/krill/zone/main.kt` (Compose Desktop)

**Analysis Required:**
1. **Coroutine Analysis:**
   - Map all CoroutineScope declarations and their lifecycle management
   - Identify orphaned scopes (created but never cancelled)
   - Check Job management in maps/collections
   - Evaluate structured concurrency usage
   - Verify proper use of SupervisorJob vs Job
   - Check for scope cancellation in cleanup methods

2. **Thread Safety:**
   - Find mutable collections accessed from coroutines
   - Check for synchronization mechanisms (Mutex, lock, synchronized, ConcurrentHashMap)
   - Identify potential race conditions in Node/Job management
   - Review StateFlow/MutableStateFlow usage
   - Check for proper volatile or atomic variables

3. **Memory Leaks:**
   - Check observer/subscription patterns for cleanup
   - Review StateFlow/SharedFlow usage and collection lifecycle
   - Identify retained references in singletons
   - Verify Job cancellation in maps/lists
   - Check for proper cleanup in stop/close/dispose methods

4. **Architecture:**
   - Evaluate module dependencies and coupling
   - Check for circular dependencies
   - Review singleton usage patterns
   - Identify god objects (classes with too many responsibilities)
   - Check for proper separation of concerns

5. **Security:**
   - Scan for hardcoded credentials (passwords, tokens, keys)
   - Review CORS configuration
   - Check authentication/authorization patterns
   - Identify exposed admin endpoints
   - Review file path security (directory traversal risks)

6. **Feature Definitions:**
   - Read `/content/feature/*.json` for feature specifications
   - Cross-reference with `KrillApp.kt` implementation
   - Identify gaps between spec and implementation
   - Check for incomplete implementations (TODO, NotImplementedError)

7. **Error Handling:**
   - Check for silent exception catching
   - Verify proper error propagation
   - Review logging patterns
   - Check for resource cleanup in error paths

8. **Code Quality:**
   - Count TODO comments and categorize by severity
   - Identify code duplication
   - Review naming conventions
   - Check documentation coverage

**Output Format:**
- Overall quality score (0-100) with breakdown by category
- Comparison with previous report (what improved, what got worse)
- Mermaid diagrams for:
  - Entry point flow
  - Coroutine scope hierarchy (with status indicators)
  - Data flow architecture
- Tables for issues (severity, location, description, status vs previous)
- Progress summary (resolved, unchanged, new issues)
- TODO items with agent prompts for completion
- Production readiness checklist
- Prioritized action items with time estimates
- Professional evaluation summary

**Previous Issues to Re-verify:**
- serverScope lifecycle management in Server.kt (Line 29)
- Jobs map synchronization in MQTTBroker (Line 24), SilentTriggerManager (Line 9)
- NodeManager.nodes map thread safety (Line 35)
- NodeEventBus subscriber cleanup (Line 9)
- CORS anyHost() security (Server.kt Line 40)
- Hardcoded passwords in KtorConfig.kt (Lines 19, 28-29) and MQTTBroker.kt (Line 29)
- Unmanaged CoroutineScope creation in UI components
- SilentTriggerManager incomplete implementation (Lines 23, 28)
- runBlocking in Application.kt main() (Line 16)

**Review Guidelines:**
1. Compare each issue with previous report to determine status: Resolved ✅, Unchanged 🟡, Worse ❌, or New 🆕
2. Provide specific line numbers for all issues
3. Include code snippets showing before/after for any changes
4. Calculate an overall progress score
5. Be objective and data-driven in assessments
6. Provide actionable recommendations with priority levels
7. Include estimated effort for each recommendation

Report generated by automated code review
Review Date: December 01, 2025
Previous Review: November 30, 2025
Days Between Reviews: 1
Reviewer: AI Code Analysis System

This post is licensed under CC BY 4.0 by the author.