Comprehensive Kotlin Code Review
Comprehensive Kotlin Code Review
Krill Platform - Comprehensive Kotlin Code Review
Date: 2025-12-03
Reviewer: GitHub Copilot Coding Agent
Scope: Server, SDK, Shared, and Compose Desktop modules (excluding iOS, Android, WASM)
Total Files Analyzed: 128 Kotlin files
Total Lines of Code: ~2,747 lines in core modules
Executive Summary
The Krill platform is a Kotlin multiplatform IoT automation system with a Ktor server backend, SDK for shared logic, and Compose Desktop UI. The codebase demonstrates solid architectural patterns with structured concurrency and event-driven design. However, several critical issues around coroutine lifecycle management, thread safety, and memory leaks require immediate attention.
Overall Quality Score: 72/100
Breakdown:
- Architecture & Design: 80/100
- Coroutine Management: 60/100
- Thread Safety: 65/100
- Memory Management: 70/100
- Code Quality: 85/100
- Security: 70/100
1. Entry Point Flow Analysis
1.1 Server Entry Point Flow
graph TD
A[Application.kt main] --> B[embeddedServer Netty]
B --> C[Application.module]
C --> D[Create serverScope<br/>Dispatchers.Default + SupervisorJob]
C --> E[Initialize MQTTBroker]
C --> F[Initialize PiManager]
C --> G[configurePlugins]
C --> H[Configure Routing]
C --> I[ServerLifecycleManager]
I --> J[ApplicationStarted Event]
J --> K[broker.start]
J --> L[subscribeToNodeEvents]
I --> M[ServerReady Event]
M --> N[pi.initPins]
I --> O[ApplicationStopping Event]
O --> P[scope.cancel]
D --> Q[initializeHost]
Q --> R[nm.init - NodeManager]
Q --> S[BeaconService.start]
Q --> T[SerialDirectoryMonitor.start]
1.2 Desktop Entry Point Flow
graph TD
A[main.kt] --> B[application block]
B --> C[Create CoroutineScope<br/>SupervisorJob + Dispatchers.IO]
B --> D[Window composable]
D --> E[App composable]
E --> F[UI Rendering]
D --> G[onCloseRequest]
G --> H[scope.cancel]
G --> I[exitApplication]
2. Coroutine Scope Hierarchy
2.1 Scope Mapping
graph TD
A[Server.kt serverScope] --> B[MQTTBroker]
A --> C[PiManager]
A --> D[ServerLifecycleManager]
A --> E[NodeManager nm]
E --> F[NodeObserver<br/>ServerNodeObserver]
F --> G[Jobs Map<br/>per Node ID]
A --> H[BeaconService]
A --> I[SerialDirectoryMonitor]
I --> J[SerialFileMonitor<br/>per serial device]
A --> K[ComputeEngineInternals]
K --> L[Jobs Map<br/>per compute node]
A --> M[SilentTriggerManager]
M --> N[Jobs Map<br/>per trigger]
D1[Desktop main.kt scope] --> D2[App UI]
E --> E1[NodeManager.scope<br/>ORPHANED - never cancelled!]
E1 --> E2[Internal jobs for<br/>node updates/deletion]
2.2 Critical Issues Identified
ISSUE #1: Orphaned Scope in NodeManager
- Severity: CRITICAL
- Location:
krill-sdk/src/commonMain/kotlin/krill/zone/node/NodeManager.kt:31 - Description: NodeManager creates its own CoroutineScope that is never cancelled
1
private val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
This scope is created independently and has no cleanup mechanism, leading to resource leaks when the application shuts down.
ISSUE #2: NodeObserver Scope Lifecycle
- Severity: HIGH
- Location:
krill-sdk/src/jvmMain/java/krill/zone/node/NodeObserver.jvm.kt:9 - Description: Observer creates its own scope which is never cancelled
1 2 3 4 5 6 7
actual val observer: NodeObserver by lazy { if (SystemInfo.isServer()) { ServerNodeObserver(CoroutineScope(Dispatchers.IO)) } else { DefaultNodeObserver(CoroutineScope(Dispatchers.IO)) } }
The scope is created but never cleaned up. While there’s a
close()method, it’s unclear if it’s ever called.
3. Thread Safety Analysis
3.1 Mutable Collections Without Synchronization
| File | Line | Collection | Accessed From | Synchronization | Severity |
|---|---|---|---|---|---|
| NodeManager.kt | 32 | nodes: MutableMap<String, NodeFlow> | Multiple coroutines | ❌ None | CRITICAL |
| MQTTBroker.kt | 23 | Uses Mutex ✓ | Coroutines | ✅ Mutex | Good |
| NodeEventBus.kt | 9 | subscribers: MutableList<NodeEvent> | Multiple threads | ❌ None | HIGH |
| ServerSocketManager.jvm.kt | 15 | sessions: MutableSet<DefaultWebSocketServerSession> | Coroutines | ❌ None | HIGH |
| ServerNodeObserver.kt | 9 | jobs: MutableMap<String, Job> | Coroutines | ❌ None | MEDIUM |
| SerialDirectoryMonitor.kt | 10 | jobs: MutableMap<String, Job> | Single coroutine loop | ✅ OK | Low |
| ComputeEngineInternals.kt | 23 | jobs: MutableMap<String, Job> | Coroutines | ❌ None | MEDIUM |
| SilentTriggerManager.kt | 9 | jobs: MutableMap<String, Job> | Coroutines | ❌ None | MEDIUM |
| PiManager.kt | - | ids: MutableMap<Int, String> | Pi4J callbacks | ❌ None | MEDIUM |
3.2 Race Condition Examples
ISSUE #3: NodeManager.nodes Race Condition
- Severity: CRITICAL
- Location:
NodeManager.kt:32-176 - Description: The
nodesmap is accessed from multiple coroutines without synchronization:update()method modifies the mapnodes()method reads the mapdelete()method removes entriesreadNode()method reads and potentially modifies
Potential Issues:
- ConcurrentModificationException during iteration
- Lost updates when concurrent modifications occur
- Inconsistent state between checks and operations
Example Race:
1
2
3
4
5
6
7
// Thread 1 in update()
if (nodes[node.id] == null) { // Check
nodes[node.id] = NodeFlow.Success(...) // Modify - could be overwritten
}
// Thread 2 in delete()
nodes.remove(node.id) // Could happen between check and modify above
ISSUE #4: NodeEventBus Thread Safety
- Severity: HIGH
- Location:
krill-sdk/src/commonMain/kotlin/krill/zone/node/NodeEventBus.kt:9-20 - Description: Subscribers list is mutable and accessed from multiple threads
1 2 3 4 5 6 7 8 9 10 11 12 13
object NodeEventBus : NodeEvent { val subscribers = mutableListOf<NodeEvent>() fun subscribe(subscriber: NodeEvent) { subscribers.add(subscriber) // Not thread-safe } override fun post(host: Node) { subscribers.forEach { // ConcurrentModificationException possible it.post(host) } } }
ISSUE #5: ServerSocketManager Session Set
- Severity: HIGH
- Location:
shared/src/jvmMain/kotlin/krill/zone/server/ServerSocketManager.jvm.kt:15-23 - Description: Sessions set modified from WebSocket coroutines without synchronization ```kotlin private val sessions = mutableSetOf
()
suspend fun broadcast(node: Node) { sessions.forEach { session -> // Could be modified during iteration runCatching { session.sendSerialized(SocketData(node)) } } }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
---
## 4. Memory Leak Analysis
### 4.1 Observer/Subscription Cleanup
**ISSUE #6: NodeEventBus Subscriber Cleanup**
- **Severity:** HIGH
- **Location:** `NodeEventBus.kt`
- **Description:** No mechanism to unsubscribe from NodeEventBus
- **Impact:** Subscribers added in `ServerLifecycleManager.subscribeToNodeEvents()` are never removed, causing memory leaks
**Lifecycle Issue:**
```kotlin
// In ServerLifecycleManager.onStarted()
NodeEventBus.subscribe { node ->
// Lambda captures scope and other references
scope.launch { ... }
}
// No corresponding unsubscribe in onStopped()
4.2 StateFlow Collection Lifecycle
ISSUE #7: Uncollected Flow Subscriptions
- Severity: MEDIUM
- Location:
ServerNodeObserver.kt:27-72 - Description: Nested flow collection without clear cancellation
1 2 3 4
n.node.collect { collectedNode -> // Inner collection - if outer job is cancelled, // this should also stop, but lifecycle is complex }
4.3 Singleton Analysis
Singletons Found:
NodeEventBus- object, holds mutable state ⚠️ServerSocketManager- object, holds session set ⚠️nm(NodeManager) - lazy singleton, never cleaned up ⚠️
ISSUE #8: Lazy Singleton Initialization
- Severity: MEDIUM
- Location:
NodeManager.kt:14-22 - Description: NodeManager is a lazy singleton that creates resources (scope, maps) but never cleans them up
1 2 3
val nm: NodeManager by lazy { return@lazy NodeManager(...) }
5. Architecture Analysis
5.1 Module Dependencies
graph LR
Server[server/] --> SDK[krill-sdk/]
Server --> Shared[shared/]
Desktop[composeApp/] --> SDK
Desktop --> Shared
SDK --> |platform-specific| JVM[jvmMain/]
SDK --> |platform-specific| Android[androidMain/]
SDK --> |common| Common[commonMain/]
Strengths:
- Clean separation between server and client code
- Proper use of Kotlin Multiplatform structure
- Shared SDK for common logic
Issues:
- Circular dependency risk:
nmsingleton accessed across modules - Global state in singletons creates tight coupling
- No clear dependency injection pattern
5.2 Feature Implementation vs Specification
Cross-referenced /content/feature/*.json with KrillApp.kt implementation:
| Feature | Spec State | Implementation | Gap |
|---|---|---|---|
| KrillApp.Server | ROADMAP | ✅ Implemented | ✓ |
| KrillApp.Server.Pin | ROADMAP | ✅ Implemented | ✓ |
| KrillApp.DataPoint | ROADMAP | ✅ Implemented | ✓ |
| KrillApp.DataPoint.Trigger | ROADMAP | ✅ Implemented | ✓ |
| KrillApp.DataPoint.CalculationEngine | ROADMAP | ✅ Implemented | ✓ |
| KrillApp.DataPoint.Compute | ROADMAP | ✅ Implemented | ✓ |
| KrillApp.RuleEngine | ROADMAP | ⚠️ Partial | Missing: When, Do, Unless, Until |
| KrillApp.RuleEngine.RuleEngineWhen.CronTimer | ROADMAP | ❌ Not found | ❌ Gap |
| KrillApp.RuleEngine.RuleEngineWhen.IncomingWebHook | ROADMAP | ❌ Not found | ❌ Gap |
| KrillApp.RuleEngine.Execute.OutgoingWebHook | ROADMAP | ❌ Not found | ❌ Gap |
| KrillApp.RuleEngine.Execute.ExecuteScript | ROADMAP | ❌ Not found | ❌ Gap |
| KrillApp.RuleEngine.RuleEngineCleanup | ROADMAP | ❌ Not found | ❌ Gap |
| KrillApp.SerialDevice | ROADMAP | ✅ Implemented | ✓ |
ISSUE #9: Incomplete RuleEngine Implementation
- Severity: MEDIUM
- Description: RuleEngine features are defined in KrillApp.kt but lack processor implementations
- Impact: Features appear in UI but don’t function
6. Security Issues
6.1 Previous Issues Re-verification
ISSUE #10: CORS Configuration - NOT FOUND
- Severity: INFO
- Status: ✅ No CORS anyHost() found
- Note: CORS plugin is not configured in the current codebase. If this is needed, it should be added with proper host restrictions.
ISSUE #11: Hardcoded Credentials
- Severity: HIGH
- Location: Multiple files
- Description: Hardcoded passwords found:
server/src/main/kotlin/krill/zone/server/KtorConfig.kt:20- “changeit”server/src/main/kotlin/krill/zone/server/mqtt/MQTTBroker.kt:28- “changeit”
6.2 Data Flow Architecture
graph TD
Serial[Serial Devices] --> SDM[SerialDirectoryMonitor]
SDM --> SFM[SerialFileMonitor]
SFM --> Node[Node Updates]
Node --> NM[NodeManager]
NM --> NEB[NodeEventBus]
NEB --> SLM[ServerLifecycleManager]
SLM --> MQTT[MQTTBroker]
SLM --> WS[ServerSocketManager]
SLM --> BC[Multicast Beacon]
NM --> Obs[NodeObserver]
Obs --> Engines[Engines]
Engines --> CE[ComputeEngine]
Engines --> DE[DataEngine]
Engines --> RE[RuleEngine]
CE --> DS[DataStore]
Node --> StateFlow[MutableStateFlow<Node>]
StateFlow --> Collectors[Flow Collectors]
7. Critical Issues Summary
High Priority (Must Fix)
| # | Issue | Severity | Location | Impact |
|---|---|---|---|---|
| 1 | NodeManager orphaned scope | CRITICAL | NodeManager.kt:31 | Resource leak, coroutines never cleaned up |
| 2 | NodeObserver scope lifecycle | HIGH | NodeObserver.jvm.kt:9 | Memory leak on server shutdown |
| 3 | NodeManager.nodes race condition | CRITICAL | NodeManager.kt:32 | Data corruption, crashes |
| 4 | NodeEventBus thread safety | HIGH | NodeEventBus.kt:9 | ConcurrentModificationException |
| 5 | ServerSocketManager sessions | HIGH | ServerSocketManager.jvm.kt:15 | ConcurrentModificationException |
| 6 | NodeEventBus subscriber cleanup | HIGH | NodeEventBus.kt | Memory leak |
| 11 | Hardcoded credentials | HIGH | Multiple files | Security risk |
Medium Priority (Should Fix)
| # | Issue | Severity | Location | Impact |
|---|---|---|---|---|
| 7 | Uncollected flow subscriptions | MEDIUM | ServerNodeObserver.kt | Potential memory leak |
| 8 | Lazy singleton cleanup | MEDIUM | NodeManager.kt:14 | Resource not released |
| 9 | Incomplete RuleEngine | MEDIUM | KrillApp.kt | Features don’t work |
Low Priority (Nice to Have)
- Add structured logging with correlation IDs
- Implement proper dependency injection
- Add metrics/monitoring for coroutine health
- Document lifecycle expectations
8. Recommendations & TODO Items
TODO #1: Fix NodeManager Scope Lifecycle
Priority: CRITICAL
Effort: 2 hours
Agent Prompt:
1
2
3
4
5
6
Fix the orphaned CoroutineScope in NodeManager by:
1. Remove the private scope property
2. Accept a CoroutineScope as a constructor parameter
3. Update the lazy `nm` initialization to accept a scope from the caller
4. Ensure the scope is cancelled when the application shuts down
5. Update all callers to pass in their parent scope
TODO #2: Synchronize NodeManager.nodes Map
Priority: CRITICAL
Effort: 4 hours
Agent Prompt:
1
2
3
4
5
6
Add thread-safe access to NodeManager.nodes map:
1. Replace MutableMap with concurrent access using Mutex or synchronized blocks
2. Consider using kotlinx.atomicfu for atomic operations
3. Ensure all read/write operations are protected
4. Add unit tests to verify thread safety under concurrent access
5. Document the synchronization strategy
TODO #3: Fix NodeEventBus Thread Safety
Priority: HIGH
Effort: 2 hours
Agent Prompt:
1
2
3
4
5
6
Make NodeEventBus thread-safe:
1. Replace mutableListOf with a thread-safe collection (CopyOnWriteArrayList or use synchronization)
2. Add unsubscribe() method for cleanup
3. Update ServerLifecycleManager to unsubscribe on shutdown
4. Consider using StateFlow for event bus instead of custom implementation
5. Add documentation on subscriber lifecycle
TODO #4: Fix ServerSocketManager Thread Safety
Priority: HIGH
Effort: 1 hour
Agent Prompt:
1
2
3
4
5
Synchronize ServerSocketManager.sessions:
1. Use ConcurrentHashMap.newKeySet() or CopyOnWriteArraySet for sessions
2. Ensure broadcast() safely iterates over sessions
3. Add synchronized blocks if using regular MutableSet
4. Add error handling for session failures during broadcast
TODO #5: Add Observer Cleanup
Priority: HIGH
Effort: 2 hours
Agent Prompt:
1
2
3
4
5
Implement proper cleanup for NodeObserver:
1. Ensure ServerNodeObserver.close() is called on server shutdown
2. Add observer.close() call in ServerLifecycleManager.onStopped()
3. Verify all jobs in the jobs map are cancelled
4. Add logging to track observer lifecycle
TODO #6: Implement Missing RuleEngine Features
Priority: MEDIUM
Effort: 8-16 hours
Agent Prompt:
1
2
3
4
5
6
7
Implement missing RuleEngine features to match specifications:
1. Create RuleEngineWhen processors (CronTimer, IncomingWebHook)
2. Create Execute processors (OutgoingWebHook, ExecuteScript)
3. Implement RuleEngineCleanup
4. Add tests for each feature
5. Update documentation
6. Ensure proper coroutine scope management in processors
TODO #7: Remove Hardcoded Credentials
Priority: HIGH
Effort: 1 hour
Agent Prompt:
1
2
3
4
5
6
Replace hardcoded credentials with environment variables or configuration:
1. Replace "changeit" password in KtorConfig.kt with env var
2. Replace "changeit" password in MQTTBroker.kt with env var
3. Add documentation on how to configure credentials
4. Add validation that credentials are set
5. Consider using a secrets management solution
TODO #8: Add Synchronization to Job Maps
Priority: MEDIUM
Effort: 3 hours
Agent Prompt:
1
2
3
4
5
6
7
Add thread-safe access to all job maps:
1. ServerNodeObserver.jobs
2. ComputeEngineInternals.jobs
3. SilentTriggerManager.jobs
4. SerialDirectoryMonitor.jobs (already safe, single coroutine)
5. SerialFileMonitor.jobs
Use Mutex or concurrent collections as appropriate.
9. Strengths & Positive Observations
- Structured Concurrency: Good use of SupervisorJob for fault isolation
- Clean Architecture: Separation of concerns between server, SDK, and UI
- Multiplatform Design: Well-structured Kotlin Multiplatform setup
- Event-Driven: NodeEventBus provides decoupling (though needs thread safety)
- Error Handling: Try-catch blocks in critical paths
- Lifecycle Management: ServerLifecycleManager provides clear lifecycle hooks
- Plugin Architecture: KrillApp sealed class provides extensible feature system
- Flow Usage: Proper use of StateFlow for reactive state management
10. Testing Recommendations
Currently, no test infrastructure was found in the scoped modules. Recommendations:
- Add Unit Tests:
- NodeManager operations (thread safety)
- NodeEventBus subscribe/post/unsubscribe
- Engine implementations
- Add Integration Tests:
- Server startup/shutdown lifecycle
- WebSocket session management
- MQTT broker integration
- Add Coroutine Tests:
- Use
runTestfrom kotlinx-coroutines-test - Test scope cancellation behavior
- Test concurrent access scenarios
- Use
11. Monitoring & Observability
Recommendations:
- Add structured logging with correlation IDs
- Add metrics for:
- Active coroutine count per scope
- Node count in NodeManager
- Active WebSocket sessions
- MQTT client count
- Add health check endpoints
- Add graceful shutdown with timeout
Conclusion
The Krill platform demonstrates solid architectural foundations with good separation of concerns and multiplatform design. However, critical issues around coroutine lifecycle management and thread safety must be addressed before production deployment. The most urgent items are:
- Fix NodeManager orphaned scope (CRITICAL)
- Synchronize NodeManager.nodes map (CRITICAL)
- Fix NodeEventBus thread safety (HIGH)
- Remove hardcoded credentials (HIGH)
With these fixes, the platform will be significantly more robust and production-ready. The quality score could improve from 72/100 to 85+/100.
Next Steps:
- Address CRITICAL priority issues first
- Implement thread safety fixes
- Add comprehensive tests
- Complete RuleEngine implementation
- Add monitoring and metrics
- Conduct security audit
- Performance testing under load
Report Generated: 2025-12-03
Analyzed: 128 Kotlin files, ~2,747 lines of code
Modules: server, krill-sdk, shared, composeApp (desktop)