Software escrow
- 3rd party that stores source code, compiled code, manual
- hands over stored data to the customer only on specific conditions
- protects the customer if the software developer goes bankrupt ≡ absence of patches
Testing
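- fuzzing: feed random or malformed input to find crashes
- static analyzer: inspects code without running it
- valgrind: detects memory leaks and invalid memory accesses at run time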
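
To make the fuzzing bullet concrete, here is a minimal sketch of a random-input fuzzer. It is only an illustration: `parse_record`, `fuzz`, and the toy length-prefixed format are hypothetical names and assumptions, not part of any real tool. Inputs the target rejects with `ValueError` count as handled; any other exception is recorded as a crash.

```python
import random


def parse_record(data: bytes):
    """Hypothetical target: first byte is the payload length, the rest is payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return length, payload


def fuzz(target, runs=1000, max_len=16, seed=0):
    """Call target with random byte strings and collect unexpected failures."""
    rng = random.Random(seed)  # fixed seed so a failing run can be reproduced
    crashes = []
    for _ in range(runs):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # expected rejection of malformed input
        except Exception as exc:  # anything else is treated as a bug
            crashes.append((data, exc))
    return crashes


if __name__ == "__main__":
    print(len(fuzz(parse_record)), "unexpected failures")
```

Real fuzzers typically add coverage feedback and input mutation on top of this loop; the sketch only shows the basic idea of random input plus crash detection.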
Hadoop
- node types
- NameNode:
- coordinates data movement: keeps the map of where every block is stored and where it is replicated
- active/standby
- block size: 64 MB (default in Hadoop 1.x), 128 MB (default in Hadoop 2.x and later)
- DataNode: store data
- does not use RAID: redundancy comes from block replication across DataNodes
- MapReduce
- batch processing, not real-time
- query is broken into smaller tasks and distributed across nodes
- high I/O to disks ≡ latency
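
The "broken into smaller tasks" idea can be sketched as a word count in plain Python. This is a single-process illustration of the map, shuffle, and reduce phases, not the Hadoop API; the function names are assumptions chosen for the example. In a real cluster each chunk would be mapped on a different node and intermediate results would be written to disk between phases, which is where the I/O latency comes from.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run the three phases in sequence over a list of text chunks."""
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```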
Apache Spark
- in-memory
- Spark Streaming: real-time processing
- integrates with messaging system (e.g., Kafka)
- DStream: discretized stream ≡ microbatch
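
The microbatch idea behind a DStream can be sketched without Spark: incoming events are cut into consecutive fixed-width time windows, and each window is then processed as a small batch. The code below is a plain-Python illustration only; `micro_batches` and the `(timestamp, value)` event format are assumptions for the example, not the Spark Streaming API.

```python
def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-width windows.

    Returns one list per interval, including empty lists for intervals
    in which no event arrived.
    """
    batches = {}
    for timestamp, value in events:
        index = int(timestamp // batch_interval)  # which window the event falls into
        batches.setdefault(index, []).append(value)
    if not batches:
        return []
    first, last = min(batches), max(batches)
    return [batches.get(i, []) for i in range(first, last + 1)]


if __name__ == "__main__":
    events = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (3.5, "d")]
    for batch in micro_batches(events, batch_interval=1.0):
        print(batch)
```

A shorter batch interval lowers latency but increases per-batch overhead, which is why microbatching is near real-time rather than strictly event-by-event.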