Transcript Slide 1
Toward Systems With a Will to Live
SORNS: Self Organizing Resilient Network Sensing
3 Mar 2010
This project is sponsored by the Department of Homeland Security under contract D10PC20039. The content of the material contained herein does not necessarily reflect the position or policy of the Government, and no official endorsement is implied.
[email protected]

Slides 2-7
General Current Situation
Adversarial Domain (AD): Adversarial Agents (AA), grouped into Adversarial Communities.
Security Domain (SD): Security Agents (SA), grouped into Security Communities.
Relatively Static Security and System Artifacts (A): Static artifacts are systems with and without security measures, updated occasionally.
Dynamic Attack: Dynamic attack includes human and systemic adaptive control preying upon fixed artifact defenses.

Slide 8
Static System, Dynamic Adversary

Slide 9
Asymmetries
Adversary is a natural system, security strategy is an artificial system
Adversary leads with innovation and evolution
Adversary self-organizes as a dynamic system-of-systems
… up next …
Three Dynamic Self Organizing System-of-System Security Patterns
Pattern employment on the SORNS project

Inspirational Patterns
from natural systems that effectively process noisy sensory input from uncertain and changing environments

Slide 11
Pattern: Bow Tie Processor (assembler/generator/mediator)
[Figure: bow-tie diagram of immune-system detector generation, with input on the left, the "knot" in the middle, and output on the right]
Input: Available high-variety genetic DNA input. Three fixed V-D-J gene-segment libraries evolve: 123 Variable (V) segments, 27 Diverse (D) segments, and 6 Joining (J) segments.
Knot: Fixed-rule VDJ assembly with random interconnects, taking one random segment from each library plus random connections. ~10^6 VDJ+VJ antigen-detector shapes are possible, increasing to ~10^9 varieties with the addition of random nucleotide connections between the VDJ and VJ joinings.
Output: Random high-variety output with VDJ + VJ assemblies. Millions of random infection detectors are generated continuously by fixed rules and modules in the "knot".
Context: Complex system with many diverse inputs and many diverse outputs, where outputs need to respond to many needs or innovate for many or unknown opportunities, and it is not practical to build unique one-to-one connections between inputs and outputs. Appropriate examples include common financial currencies that mediate between producers and consumers, the adaptable biological immune system that produces proactive infection detectors from a wealth of genetic material, and the Internet protocol stack that connects diverse message sources to diverse message sinks.
Problem: Too many connection possibilities between available inputs and useful outputs to build unique, robust, evolving satisfaction-processes between each.
Forces: Large knot short-term-flexibility vs small knot short-term-controllability and long-term-evolvability (Csete 2004); robustness to known vs fragility to unknown (Carlson 2002).
Solution: Construct a relatively small "knot" of fixed modules from selected inputs that can be assembled into outputs as needed according to a fixed protocol. A proactive example is the adaptable immune system, which constructs large quantities of random detectors (antibodies) for unknown attacks and infections. A reactive example is a manufacturing line that constructs products for customers demanding custom capabilities.

Slides 12-13
Adaptable Immune System Bow-Tie Antigen-Detector Generator
(generic agile-system architectural-concept diagram)
[Figure: B-cell with a Y-shaped detector antibody on the cell body, assembled from a V-D-J long chain and a V-J short chain]
Module Pools: 123 V segments, 27 D segments, 6 J segments, random nucleotides
Integrity Management (systemic integrity-management processes):
• Module pools and mix evolution: genetic evolution (module evolution)
• Module inventory condition: ??repair mechanisms?? (inventory condition)
• Detector assembly: bone marrow and thymus (system assembly)
• Infrastructure evolution: genetic evolution (rules evolution)
Infrastructure (Active, Passive): detector sequence n, detector sequence n+1, detector sequence n+2, each combining a long chain and a short chain
Assembly Rules:
• Use one each V-J
• Use one each V-D-J
• Add random nucleotides
• Combine two assemblies

Slide 14
Pattern: Proactive Anomaly Search
Speculative generation and mutation of detectors recognizes new attacks like a biological immune system
Context: A complex system or system-of-systems subject to attack and infection, with low tolerance for attack success and no tolerance for catastrophic infection success; with resilient remedial action capability when infection is detected. Appropriate examples include biological organisms, and cyber networks for military tactical operations, national critical infrastructure, and commercial economic competition.
Problem: Directed attack and infection types that constantly evolve in new innovative ways to circumvent in-place attack and infection detectors.
Forces: False positive tradeoffs with false negatives; system functionality vs functionality-impairing detection measures; detectors for anything possible vs added costs of comprehensive detection; comprehensive detection of attack vs cost of false detection of self.
Solution: A high fidelity model of biological immune system antibody (detection) processes that generate a high quantity and variety of anticipatory speculative detectors in advance of attack and during infection, and evolve a growing memory of successful detectors specific to the nature of the system-of-interest.

Slide 15
Pattern: Hierarchical Sensemaking
Four level feed forward/backward sense-making hierarchy modeled on visual cortex
Context: A decision maker in need of accurate situational awareness in a critical dynamic environment.
Examples include a network system administrator in monitoring mode and under attack, a military tactical commander in battle, and the NASA launch control room.
Problem: A very large amount of low-level noisy sensory data overwhelms attempts to examine it and conclude what relevance may be present, especially if time is important or if the sensory data is dynamic.
Forces: Amount of data to be examined vs time to reach a conclusion; number of ways data can be combined vs number of conclusions data can indicate; static sensory data vs dynamic sensory data; noise tolerated in sensory data vs cost of low-noise sensory data.
Solution: Using a bow-tie process, each level looks for a specific finite set of data patterns among the infinite possibilities of its input combinations, aggregating its input data into specific chunks of information. These chunks are fed forward to the next higher level, which treats them in turn as data to be further aggregated into higher forms of information chunks. Through feedback, a higher level may bias a lower level to favor certain chunks over others, predicting what is expected now or next according to an emerging pattern at the higher level. Each level is only interested in a small number of an infinite set of data-combination possibilities, but as aggregation proceeds through multiple levels, complex data abstractions and recognitions are enabled.

BIS Architecture
(Biological Immune System)

Slides 17-18
Antibody Creation & Life Cycle
General antibody life cycle: creation, false-positive testing, deployment efficacy or termination, mutation improvement, and long-term memory.
1. Candidate antibody semi-randomly created.
2. Tolerization period tests immature candidates for false-positive matches.
3. Mature & naïve antibodies put into time-limited service.
4. Activated (B-cell) antibodies need co-stimulation (by T-cells) to ensure "improvement" didn't produce an auto-reactive result; non-activated & non-co-stimulated candidates die when the time limit ends.
5. Highest affinity co-stimulated antibodies are remembered for a time-limited long term (e.g., many years, decades).
6. Co-stimulated antibodies are cloned with structured mutations, looking for improved (higher) affinity scores.
Diagram modified from (Hofmeyr 2000).
Shape/Pattern Space ~10^9
Self nonself discrimination: A universe of data points is partitioned into two sets – self and nonself. Negative detectors cover subsets of non-self.
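The tolerization step and the negative-detector idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the SORNS implementation: samples are 12-bit integers, a detector "matches" when at least `THRESHOLD` bit positions agree, and the names `agreeing_bits`, `generate_detectors`, and `is_nonself` are hypothetical.

```python
import random


def agreeing_bits(a: int, b: int, n_bits: int) -> int:
    """Count the bit positions (out of n_bits) where a and b agree."""
    return n_bits - bin(a ^ b).count("1")


def matches(detector: int, sample: int, n_bits: int, threshold: int) -> bool:
    """Assumed match rule: at least `threshold` agreeing bit positions."""
    return agreeing_bits(detector, sample, n_bits) >= threshold


def generate_detectors(self_set, n_bits, threshold, count, rng, max_tries=100_000):
    """Create random candidates and keep only those that match no self sample."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == count:
            break
        candidate = rng.getrandbits(n_bits)  # semi-random creation
        # Tolerization: a candidate that matches self is a false positive and is dropped.
        if not any(matches(candidate, s, n_bits, threshold) for s in self_set):
            detectors.append(candidate)
    return detectors


def is_nonself(sample, detectors, n_bits, threshold):
    """A sample is flagged when any surviving (negative) detector matches it."""
    return any(matches(d, sample, n_bits, threshold) for d in detectors)


rng = random.Random(2010)  # fixed seed so the run is repeatable
N_BITS, THRESHOLD = 12, 10  # illustrative values only
self_set = [rng.getrandbits(N_BITS) for _ in range(8)]
detectors = generate_detectors(self_set, N_BITS, THRESHOLD, 200, rng)
flagged_self = [s for s in self_set if is_nonself(s, detectors, N_BITS, THRESHOLD)]
print(len(detectors), "detectors kept;", len(flagged_self), "self samples flagged")
```

Because the match rule is symmetric, no surviving detector can flag a self sample, which is the false-positive guarantee that the tolerization period is meant to provide.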
(Self/nonself discrimination description from Esponda 2004.)

SORNS Application
Self Organizing Resilient Network – Sensing

Slides 20-21
Reconfigurable Pattern Processor
Reusable Cells Reconfigurable in a Scalable Architecture
Enables High Fidelity Modeling
Independent detection cell: content addressable by the current input byte. If active, and satisfied with the current byte, it can activate other designated cells, including itself.
Cell-satisfaction output pointers
Cell-satisfaction activation pointers: individual detection cells are configured into detectors by linking activation pointers.
Up to 256 possible features can be "satisfied" by all so-designated byte values.
An unbounded number of detection cells configured as detectors can extend indefinitely across multiple processors.
All active cells have simultaneous access to the current data-stream byte.

Slide 22
SORN-S Architecture
Self Organizing Resilient Network - Sensing
[Figure: four-level agent hierarchy within and across SORN-S Hardware Devices]
Level 4 Agents: Human Interface (Human Control)
Level 3 Agents: Inter-SORN Collaboration. Architecture anticipates collaboration with other SORN by Level 3 agents.
Level 2 Agents: Intra-SORN-S Collaboration
Level 1 Agents: EP Attack Detectors, EP Infection Detectors
Multi-level architecture refines sensory input through a learning and sensemaking hierarchy, and supports remedial action agents (human/automated) with succinct relevant information.
Notes:
• For all-level hierarchical network-agent architecture general concept see (Haack 2009).
• For hierarchical feed-forward/backward pattern learning, prediction, and sense-making see (George 2009).
• For all-level hierarchical learning of causal patterns spread as time-sequence events see (Hawkins 2010).

Slide 23
Level 1 & 2 Agent: Detector Creation & Learning Architecture
General L1 detector life cycle: creation, false-positive testing, deployment efficacy or termination, mutation improvement, and long-term memory.
1. Candidate fuzzy detector semi-randomly created.
2. Tolerization period tests immature candidates for false-positive matches.
3. Mature & naïve candidates put into time-limited service.
4. Activated (B-cell) candidates wait for co-stimulation (by T-cells) to ensure "improvement" didn't produce an auto-reactive result; non-activated & non-co-stimulated candidates die when the time limit ends.
5. Highest scoring co-stimulated candidates are remembered for a time-limited long term.
6. Co-stimulated candidates are cloned with structured mutations, looking for improved (higher) activation scores.
7. Level 2 Agent insertion of activated candidates from other end points, and Level 2 Agent distribution of activated candidates to other end points.
Diagram modified from (Hofmeyr 2000).

Slides 24-25
Determining an AIS Approach: According to Jon Timmis
Amelia Ritahani Ismail. 2008. On the Use of Modelling and Simulation for Granuloma Formation. Department of Computer Science, University of York, Final Qualifying Dissertation, Supervisor: Prof. Jon Timmis. www.cosmos-research.org/wiki/images/0/05/FINAL_QD.pdf
The Structure of AIS – de Castro & Timmis (2002) proposed a layered framework for engineering AIS that takes the application domain as a starting point followed by three design layers that form the structure of AIS (refer to figure 2.6). These layers are:
• the representation of the components of the systems and known as shape space
• a set of mechanisms to evaluate the interactions of the individuals with the environment known as the affinity measures
• the use of the algorithm (such as clonal selection, negative selection, immune network) which governs the behaviour of the systems
de Castro & Timmis (2002) argue that the basis of every system is the application domain, therefore the way in which the components of the systems will be represented has to be considered. When the suitable representations have been decided, one or more affinity measures (such as Hamming and Euclidean distance) are used to quantify the interaction elements of the systems. Finally, the use of AIS algorithms or processes to govern the behaviour (dynamics) of the systems is needed.
Applied to SORNS:
• Algorithms: generation, negative selection, clonal selection, immune network, etc.
• Affinity measures: the strength of a match between detector and intrusion signature.
• Shape space: the quantity and nature of the detector features.
• Application domain: Pattern processor technology applied to IP packets.
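The affinity-measure layer can be made concrete with the two measures the quoted passage names, Hamming and Euclidean distance. The sketch below is a hedged illustration only: it assumes the shape space is a fixed-length byte vector (here seven bytes), and the function names `hamming_affinity`, `euclidean_distance`, and `best_match` are hypothetical rather than taken from the project.

```python
import math


def hamming_affinity(detector: bytes, signature: bytes) -> int:
    """Affinity as the number of agreeing bit positions (higher is stronger)."""
    if len(detector) != len(signature):
        raise ValueError("detector and signature must have the same length")
    differing = sum(bin(d ^ s).count("1") for d, s in zip(detector, signature))
    return 8 * len(detector) - differing


def euclidean_distance(detector, signature) -> float:
    """Distance between two feature vectors (lower means a closer match)."""
    if len(detector) != len(signature):
        raise ValueError("detector and signature must have the same length")
    return math.sqrt(sum((d - s) ** 2 for d, s in zip(detector, signature)))


def best_match(detectors, signature):
    """Pick the detector with the highest Hamming affinity to the signature."""
    return max(detectors, key=lambda d: hamming_affinity(d, signature))


# Illustrative 7-byte shapes; the values are made up for the example.
signature = bytes([192, 168, 1, 44, 0, 118, 2])
detectors = [
    bytes([192, 168, 1, 45, 0, 118, 2]),  # differs from the signature in one bit
    bytes([10, 0, 0, 1, 0, 80, 0]),       # differs in many bits
]
closest = best_match(detectors, signature)
print(hamming_affinity(closest, signature), "of", 8 * len(signature), "bits agree")
```

Which measure suits a given shape space is a design decision: Hamming affinity treats every bit as an independent feature, while Euclidean distance assumes the byte values are numerically meaningful.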
Slide 26
Level 1 & 2 Agent: Detector Creation & Learning Architecture
(Detector life-cycle list and diagram repeated from slide 23. Diagram modified from (Hofmeyr 2000).)
Callout on the cloning improvement step: Not necessary – new approach departs from a strict BIS model.
Eventually it became evident that a cloning improvement GA process was not necessary, as a speculative fuzzy-detector match-event could simply snatch and record the signature that caused the event.
The realization: BIS works with complementary patterns rather than duplicate patterns, and can only create a best-fit complementary pattern through trial and error.
This departure from the BIS model was initially uncomfortable and threw into doubt our high fidelity objective. We got over it. Coverage of pattern space (detector quantity and diversity) is the high fidelity issue, not process mirroring.

Slide 27
Detection Categories (Types)
Spatial Connection: L1 packet headers (feasibility demo for TCP/UDP/ICMP).
Spatial Content: L1 deep packet inspection (feasibility demo for HTTP/SMTP???).
Temporal Connection: L2 multi-packet header detectors (feasibility demo for TCP/UDP/ICMP).
Temporal Content: L2 multi-packet content detectors (feasibility demo for HTTP/SMTP???).
Rather than categorize by attack type (like UTMs do), of which there is a never-ending list as new types emerge, our categories relate to the detection philosophy: automated learning of spatial and temporal pattern features within a fixed set of generic pattern structures. A SYN flood attack, for example, is one instance within the temporal-connection pattern category.
• Spatial means a collection of features at one instant (packet/log-entry) in time that is detected as a pattern-of-interest.
• Temporal means multi-packet/multi-log-entry features that are detected as a pattern.
• Correlative (unaddressed here) means multi-flow/multi-log features detected as a pattern.

Slide 28
Phase 1 Initial Focus
IPv4 packet-header detection – single packet-header signature patterns (spatial connection category)
Three elements to a pattern signature: address – port – type
• Address: 4 bytes - Only the non-host address is of interest.
• Port: 2 bytes - Only the destination port is of interest.
• Types: 3 bits covers 8 types – (TCP, UDP, ICMP, other) x (incoming, outgoing)
The L1-Agent preprocessor/controller selects relevant features from network packets and feeds them as condensed "feature packets" to the pattern processor.
[Figure: data flow among network packets, L1 Agent, Pattern Processor, and L2 Agent; labeled flows: IPv4 feature packets, detection alert]
L1 Agent:
• conventional processor/memory
• detector generator
• feature packet assembly
• pattern processor controller
Pattern Processor:
• special purpose chip
• detectors in nursery
• detectors in service
• detectors in memory

Slides 29-37
Feature Cells and Finite State Machines
(Illustrative example of pattern processor capability)
7 multi-feature detectors (MFD) "connected" as a finite state machine (FSM), running from start to end: four for the IPv4 address, two for the port, and one for the type. Each MFD is a 256-bit associative memory.
All active MFDs are indexed by the input stream's current byte value. If the index finds a set bit, the next MFD is activated and looks at the next stream byte, else the process dies.
[Figure: seven MFD bit columns, ○ = 0-bit, ● = 1-bit]
Slide 29: MFDs empty.
Slide 30: Loaded with 7 values: 192.168.1.44, 0.118, 2 (one set bit per MFD).
Slides 31-37: Processing Data Stream 192.168.1.44, 0.118, 2, advancing one byte per slide.

Very Large Scale Anomaly Detector
Patent Pending

Slide 39
Fundamental Elements
[Figure: labeled examples of a feature value, a feature packet, a feature stream, a pattern (six feature values, e.g. 98 126 25 41 250 7), a pattern list, and the pattern space running from 0 0 0 0 0 0 to 255 255 255 255 255 255]
Pattern space contains 256^6 = 2.81x10^14 unique patterns for patterns consisting of 6 feature values with range 0-255.

Slides 40-42
Gang Detector (GD)
[Figure: Gang Detector (e.g., with seven multi-feature detectors) drawn as bit columns, labeling a Pattern (or Pattern Path), a Feature Indicator (1-bit), a Non-Feature Indicator (0-bit), and the Multi-Feature Detectors (MFD, variable size)]
A GD is implemented as a 2 dimensional bit array, with each column corresponding to an MFD of independent size, but typically a max of 256 to accommodate associative addressing (indexing) by an 8-bit Feature Packet byte.
A Feature Indicator is a 1-bit in any or all of the possible index values.
One GD with all Feature Indicators present (all bits set to 1) would have 256x256x256x256x256x256x8 = 2.25x10^15 unique Pattern Paths. This many unique patterns would be represented in just (6x32)+1 = 193 8-bit data bytes. A pattern list holding each of these patterns would instead need seven bytes per pattern, ~10^16 data bytes in contrast.
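The bit-array description and the path arithmetic can be checked with a small model. This is a sketch under assumptions, not the patented hardware: each MFD is modeled as a Python list of booleans, the type MFD is given 8 entries, and the names `empty_gd`, `load_pattern`, `gd_matches`, `pattern_paths`, and `storage_bytes` are hypothetical.

```python
from math import prod

# Six byte-indexed MFDs (256 bits each) plus one 8-entry MFD for the type value.
MFD_SIZES = (256, 256, 256, 256, 256, 256, 8)


def empty_gd(sizes=MFD_SIZES):
    """A Gang Detector as a list of MFD columns, all bits cleared."""
    return [[False] * size for size in sizes]


def load_pattern(gd, feature_packet):
    """Set one Feature Indicator per MFD so that the packet forms a path."""
    for mfd, value in zip(gd, feature_packet):
        mfd[value] = True


def gd_matches(gd, feature_packet):
    """Index each MFD by its feature value; the walk dies at the first 0-bit."""
    if len(feature_packet) != len(gd):
        raise ValueError("feature packet length must equal the number of MFDs")
    return all(mfd[value] for mfd, value in zip(gd, feature_packet))


def pattern_paths(gd):
    """Number of distinct paths: the product of set-bit counts per MFD."""
    return prod(sum(mfd) for mfd in gd)


def storage_bytes(sizes=MFD_SIZES):
    """Bytes needed to hold every MFD bit, rounding each MFD up to whole bytes."""
    return sum(-(-size // 8) for size in sizes)


gd = empty_gd()
load_pattern(gd, (192, 168, 1, 44, 0, 118, 2))  # 192.168.1.44, port 0.118, type 2
print(gd_matches(gd, (192, 168, 1, 44, 0, 118, 2)))  # the loaded packet
print(gd_matches(gd, (192, 168, 1, 45, 0, 118, 2)))  # one address byte changed

full_gd = [[True] * size for size in MFD_SIZES]
print(pattern_paths(full_gd), "paths in", storage_bytes(), "bytes")
```

With every bit set, the product 256^6 x 8 evaluates to 2^51, about 2.25x10^15 paths, held in 193 bytes; a single loaded pattern yields exactly one path.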
This compact representation is a unique benefit of the approach.

Gang Detector (GD)
[Figure, built up over four slides: a GD of seven MFDs drawn as columns of bits. The first slide shows one Feature Indicator in each MFD; each later slide adds one Feature Indicator to a different MFD.]
• 7 Feature Indicators = 1 Pattern (Path)
• 8 Feature Indicators = 2 Patterns (Paths)
• 9 Feature Indicators = 4 Patterns (Paths)
• 10 Feature Indicators = 8 Patterns (Paths)
Adding a single Feature Indicator, as in each step above, increases the Patterns (Paths) by a factor of 2, an exponential increase. An application might create a new GD with the same percentage of Feature Indicators in every MFD. If that were 50%, with six MFDs of size 256 and one of size 8, the total number of Patterns (Paths) upon creation would be 128x128x128x128x128x128x4 = 1.8x10^13 patterns. These are detectable at data-stream feed-speed independent of the number of patterns, a unique benefit of the approach.

Gang Detector Function and Operation
• GD Creation: Create a new Gang Detector (GD)
• GD Maturation: Mature new GD in the Nursery
• GD Insertion: Insert mature GD into Service
• GD Detection: Use GDs to detect anomalies
• GD Removal: Remove GDs from Service
Nursery Set: GDN1, GDN2, ... GDNn
Service Set: GDS1, GDS2, ... GDSm
Memory Set: DM1, DM2, ... DMm (single pattern detectors, not GDs)
Multiple gang detectors covering slightly-overlapping portions of total pattern space collectively increase the total coverage of pattern space.
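The effect of combining GDs can be estimated with a simple probabilistic sketch. It assumes, beyond what the slides state, that GDs are created independently at random, so that a GD with a fraction f of Feature Indicators set in each of c MFDs covers a fraction f^c of pattern space, and n such GDs together cover 1 - (1 - f^c)^n. The function names are hypothetical.

```python
def single_gd_coverage(fraction_set, mfd_count):
    """Share of pattern space matched by one GD (assumed model: f ** c)."""
    return fraction_set ** mfd_count


def combined_coverage(fraction_set, mfd_count, gd_count):
    """Share of pattern space matched by at least one of gd_count random GDs."""
    missed_by_one = 1.0 - single_gd_coverage(fraction_set, mfd_count)
    return 1.0 - missed_by_one ** gd_count


# Six MFDs at 50% cardinality, for a growing Service Set.
for gd_count in (64, 128, 256, 512):
    print(gd_count, f"{combined_coverage(0.5, 6, gd_count):.2%}")
```

Under these assumptions, six MFDs at 50% give about 63.50% coverage with 64 GDs and 99.97% with 512, in line with the c=6 IPv4 coverage figures reported in these slides.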
50% of Feature Indicators set across 7 MFDs (6@256 & 1@8): 99.97% coverage with 512 GDs.

Gang Detector Function and Operation: GD Maturation and GD Detection
GD Maturation:
• Start at next Feature Packet in a Feature Stream.
• For every Feature Value in the Feature Packet: do the corresponding Multi-Feature Detectors contain a Feature Indicator at the location indexed by the Feature Value?
• NO: start at the next Feature Packet.
• YES: Feature Packet finished? If NO, continue with the next Feature Value. If YES, choose one MFD in the GD and remove the corresponding Feature Indicator; done.
GD Detection:
• Start at next Feature Packet in a Feature Stream.
• For every Feature Value in the Feature Packet: do the corresponding Multi-Feature Detectors contain a Feature Indicator at the location indexed by the Feature Value?
• NO: start at the next Feature Packet.
• YES: Feature Packet finished? If NO, continue with the next Feature Value. If YES, indicate that a Pattern Path match has occurred.
Maturation and detection happen in parallel, processing the same Feature Packet Stream. Thus a single Feature Packet Stream is used for maturing new GDs in the Nursery Set and detecting anomalies in the Service Set, simultaneously.

Pattern-Space Coverage by Number of GDs (% coverage)
Number of GDs | IPv4, c=6, 50% cardinality | IPv6, c=32, 85% cardinality
64 | 63.50% | 31.29%
128 | 86.68% | 52.79%
192 | 95.14% | 67.56%
256 | 98.23% | 77.71%
320 | 99.35% | 84.69%
384 | 99.76% | 89.48%
448 | 99.91% | 92.77%
512 | 99.97% | 95.03%

Coverage as Function of Cardinality
[Figure: coverage versus cardinality.] Accelerating decline in coverage as cardinality drops; 40% thought comfortable threshold. Cardinality losses justify the value of refresh-cycling the in-service GDs, and sharing results with other endpoint agents.

Coverage of 32 MFDs Declines Fast
6 MFDs at 40% = 98.35% at 1024 GDs

Learning Curve Comparing Two Training Methods
[Figure: learning curves for two training methods; range expanded 6x.]

Very Large Scale Anomaly Detector - Other App Domains
The SORNS example is a specific domain instance of the more general “normal vs. anomalous behavior” classification problem.
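Before turning to other domains, the GD Maturation and GD Detection flows described earlier can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the MFD to prune is picked at random (the slides only say "choose one MFD"), and the Nursery stream is treated as traffic the GD should learn not to match.

```python
import random


def packet_matches(gd, packet):
    """True when every Feature Value indexes a Feature Indicator (1-bit)."""
    return all(mfd[value] == 1 for mfd, value in zip(gd, packet))


def mature(gd, packets, rng):
    """Nursery pass: a fully matched packet costs the GD one Feature Indicator."""
    for packet in packets:
        if packet_matches(gd, packet):
            column = rng.randrange(len(gd))  # choose one MFD (random pick is an assumption)
            gd[column][packet[column]] = 0   # remove the corresponding Feature Indicator
    return gd


def detect(gd, packets):
    """Service pass: report every packet that completes a Pattern Path."""
    return [packet for packet in packets if packet_matches(gd, packet)]


# Toy GD: three MFDs of size 4, all Feature Indicators set.
gd = [[1] * 4 for _ in range(3)]
seen_in_nursery = [(0, 1, 2), (1, 1, 1), (2, 0, 0)]
mature(gd, seen_in_nursery, random.Random(0))

still_matching = detect(gd, seen_in_nursery)
flagged = detect(gd, [(3, 3, 3), (0, 1, 2)])
print(still_matching, flagged)
```

After the Nursery pass the toy GD no longer matches any packet it was matured on, while a packet built from values it never saw still completes a Pattern Path and is flagged.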
Anomaly detection of this kind is of interest elsewhere, for instance:
• monitoring the operational behaviors of a swarm of unmanned autonomous weapons sent on a war fighting mission,
• video image monitoring of human behavior in terrorist target areas like airports,
• insider threat behavior,
• monitoring machine and process behaviors in factories (anti-Stuxnet),
• monitoring critical infrastructure operating behaviors, say for financial-market transactions.
Perhaps most interesting, the human brain appears to work on anomaly detection:
• it appears to select and store (learn) pattern features for uniqueness and rarity
• children learn with attention to things that are different and new
• it doesn’t learn quickly, but does recognize patterns “immediately”
• it can learn both by download (taught) and by speculative discovery

JTRS Network Testing
The testing for sufficient device security can never end, because the attacks on security evolve rapidly with intent to breach the device in new effective ways. Something “transparent” is needed on board every device that can monitor its behavior continuously for abnormalities, signal that something is amiss, and provide mediation options and field-operational data.
Transparency is key in multiple dimensions:
• Does not require program intervention (e.g., no action required by anybody except test personnel).
• Does not interfere with or degrade device functional performance.
• Does not rely on external support services (e.g., McAfee daily updates).
• Does not require external control or collaboration (e.g., black-box self-sufficient).
Physical Configuration Phases
P1: sniff the host traffic with a (centralized) stand-alone wireless device located anywhere in wireless range.
P2: sniff host traffic on-board the host (physically attached).
P3: integrated with host architecture and can take direct mediation action.
[Figure: the P1, P2, and P3 configurations, showing End-Point Monitors, Network Interfaces, Test Central, System Functions, and the Radio/Device.]

One possible implementation style: A Bump on the Network Wire …also wireless and USB filter
The InZero® Security Platform provides a new approach to protecting PCs from malware. Rather than scanning suspect files with a virus scanner that looks for an ever-increasing number of malware signatures, the Gateway’s security architecture is based on a hardware sandbox that can safely execute malicious applications without damaging the Gateway or the PC. The hardware sandbox is shown in red because it permits the sandbox applications (e.g., browser) to safely execute arbitrarily dangerous application data files from the internet and from the PC. However, the design of the sandbox prevents any malware in these data files from modifying the Gateway’s software or writing any data used by the operating system. Physical separation prevents the malware from infecting your PC.

Very Large Scale Anomaly Detector - Summary
Gang Detector (seven-feature pattern example)
• Memory breakthrough: 193 bytes vs. ~10^16 bytes for pattern storage
• Coverage breakthrough: 512 GDs cover 99.97% of pattern space at 1 endpoint
• Good coverage does not require high coverage at any endpoint:
  • Endpoint multiples boost total coverage with same coverage curve
  • Endpoint refresh boosts total coverage with same coverage curve
• Custom anomaly detection: no two installations alike (in pattern content)
• Dynamically adaptable to local situation
Pattern Processor without tradeoffs
• Speed breakthrough: speed independent of number/size of patterns
• Capacity breakthrough: affordable massive pattern counts
Potential Product Package
• Transparent to endpoint: no latency or throttling of comm speed
• Transparent to network: no accommodations required
• Transparent to budget: no signature subscription
• Transparent to installer: bump on the wire installation

A Fairer Fight
[Figure: Adversarial Agents (AA) in Adversarial Domains (AD) forming Adversarial Communities; Security Agents (SA) in Security Domains (SD) forming Security Communities; and Self Organizing Systems, with Level-2 Agents (L2) in SORNS Net Domains forming SORNS Communities (labels L2, L3).]
Dynamic Attack: Dynamic attack includes human and systemic adaptive control, unable to know the detection detail and capability.
Anomaly Sensors collaborate within a network, dynamically adjusting to local situations, and can also collaborate among networks.
Just one self-organizing security example of many more to come.

Other Aware-Systems Domains
• applications in non-cyber domains
• sensors are proliferating; sensemaking needs attention

Systems of Systems - Always There, Now Aware
26Jan2011, Ford Previews Vehicle-To-Vehicle Tech At Washington Auto Show
A dedicated short-range WiFi system on a secure channel one-ups radar safety systems by allowing full 360-degree coverage even when there's no direct line of sight.
Vehicle-to-vehicle warning systems could address nearly 80 percent of reported crashes not involving drunk drivers. Prototypes to tour the U.S. this spring. V2V tech could move to devices, bringing the capability to all cars.

Automated Disease Surveillance
ESSENCE (Lombardo and Buckeridge 2007): Electronic Surveillance System for the Early Notification of Community-based Epidemics
Source: Joseph Lombardo, Sheri Lewis, Rich Wojcik, and Wayne Loschen. 2010. Systems Engineering to Support Discovery of Threats to Public Health. INSIGHT 13(4), December.
Requirements: The requirements include a significant reduction in the time for the recognition of a significant health event so that an effective public-health response can be launched to minimize casualties. The system must be able to recognize emerging health risks for both naturally occurring infectious diseases and those that are intentional. False positives must be kept to a minimum, and periodic evaluation is needed to ensure that the system provides the performance needed.
Challenges: For an automated disease-surveillance system to achieve timely positive detection of a covert attack of weaponized Bacillus anthracis spores, the system must rely on the behaviors of individuals manifesting early symptoms of the disease. The system may also need to rely on other sources of information not used in a clinical setting.

Smart Roads. Smart Bridges.
7Feb2009, WSJ, http://online.wsj.com/article/SB123447510631779255.html
SMART MOVES: Freeway signs give estimated travel times and other information on Interstate Highway 80 in the San Francisco Bay area. Radio receivers are installed along several freeways in the San Francisco Bay area that read the electronic toll tags in passing cars. One promising avenue: real-time information about road conditions, traffic jams and other events.
The next generation of technologies promises to get that news -- and even more detailed information -- directly to drivers in their cars.
Gi-Lu Cable-Stay Bridge in Taiwan has wireless sensors and accelerometers that monitor its structural health. www.scientificamerican.com/slideshow.cfm?id=smart-bridges-harness-tech&photo_id=EF0F0341-A452-806C-8CDD679765CAA94B © C.H. LOH, NATIONAL TAIWAN UNIVERSITY AND JEROME LYNCH, UNIVERSITY OF MICHIGAN AT ANN ARBOR

[Figure caption: Fractal at national transmission and local distribution levels.]
Securing the Smart Grid: Next Generation Power Grid Security
Chapter 2 provides an overview of the threats and impacts of smart metering at the consumer level. With the benefits come security and privacy issues. Leaving security to vendors of home-based products has traditionally not been met with much success. Chapter 4 notes … An important group working on this is the NIST Cyber Security Working Group (CSWG). The primary goal of the CSWG is to develop an overall cyber security strategy for the smart grid. This strategy addresses prevention, detection, response, and recovery. The CSWG recently created NISTIR 7628, Guidelines for Smart Grid Cyber Security.
Review: samzenpus, 5Jan2011, http://books.slashdot.org/story/11/01/05/1251224/Securing-the-Smart-Grid?from=headlines
Smart grids are a reality and the future, and they promise greater reliability, affordability, efficiency and, hopefully, a better and environmentally cleaner exploitation of available resources. But all that brings to light new threats to the grids. What does that entail and what can we do to defend them - these are the two main questions that this book offers the answers to. You'll find out more about the threats to smart grids: natural, individual and organizational threats. A special part of the chapter is dedicated to the hacker threat and its various incarnations and motives.
The impacts of these threats on utility companies and others are next, with various believable scenarios that point out the threats, their attack vectors and their impacts. From threats to individuals to those to entire countries - it makes you realize what the danger really is.
Review: Zeljka Zorz, 6Dec2010, www.net-security.org/review.php?id=240

Physical Security & Smart Buildings
• Sensors on the property, at the building entry ways, within the building.
• IR motion, video, sound, pressure, ….
• ID entry ways (key card, fingerprint, iris, facial recognition…)
• RFID individual-human tracking within a building

Assessing Fleet Characteristics Useful for Sensing
Source: Michael J. Ravnitzky, Offering Sensor Network Services Using the Postal Delivery Vehicle Fleet, 18th Conference on Postal and Delivery Economics, Center for Research in Regulated Industries, Porvoo, Finland, June 2-5, 2010 http://www.prc.gov/(S(je2vev45qxfhtai2g3npcczd))/prc-docs/newsroom/techpapers/Ravnitzky Postal Sensors Paper 070910-MJR-1_1191.pdf
Table columns: Fleet Type | Single Owner | Time on the Road | Regular Routes | National Universal Geographic Coverage | Geographic Flexibility/Selectivity | Centralized Maintenance
UPS/FedEx | X | Limited | X | X | X | Limited
Postal Trucks | X | X | X | X | X | X
Marks for the remaining fleet types (column positions not preserved): Taxis X; Police Cars X X X X; City Buses X; School Buses X X; City Fleet X.
From Slashdot: The US Postal Service may face insolvency by 2011 (it lost $8.5 billion last year). An op-ed piece in the December 17, 2010 New York Times proposed an interesting business idea for the Postal Service: use postal trucks as a giant fleet of mobile sensor platforms. Think Google Streetview on steroids. The trucks could be outfitted with a variety of sensors (security, environmental, RF ...)

Potential Applications for Postal Truck-Borne Mobile Sensors
Source: Ravnitzky 2010 (full citation on the previous slide).
Application Description | Likely Customer Base
Chemical Agents | DHS, States
Biological Agents | DHS, States
Radiological Materials | DOE, DHS, States
Air Quality | EPA, States, Cities
Environmental Sensing | EPA, States, USDA, Cities
Radio/Television Signal Strength | FCC, Telecoms
Wireless Signal Strength | FCC, Telecoms
Weather/Meteorological | National Weather Service
Pothole Mapping/Road Assessment | Public Works Departments
Natural Gas Leaks | Gas Utilities
License Plate Scanning | Law Enforcement
Methamphetamine Labs | Law Enforcement
Marijuana Farms/Drug Depots | Law Enforcement
Illicit Explosives Production | Law Enforcement
Photo Imaging | Google, Law Enforcement, Local Governments
Noise Profiling/Acoustic Signature | Zoning, Cities, Research
Pest Control | State, County Governments
Biological Surveys | Scientific Community
Nuclear Radiation Leaks | NRC, Utilities
Electric Field Mapping | EPA, Cities, Scientific Community
Magnetic Field Mapping | EPA, Cities, Scientific Community
Other Scientific Investigation | DoD, DOE, Scientific Community, Universities
Meter Reading |

Personal weather station is alien chic
"Weather information from thousands of personal weather stations are being used for weather forecasting by several private and government agencies, including the National Oceanic and Atmospheric Administration (NOAA) and the Dept. of Homeland Security (DHS). The Citizens Weather Observation Program (CWOP) was created by a few amateur radio operators experimenting with transmitting weather data with packet radios, but it has expanded to include Internet-only weather stations as well.
As of September 2007, nearly 5,000 stations worldwide reported weather data regularly to CWOP's FindU database. In Feb 2007 (http://www.pnl.gov/main/publications/external/technical_reports/PNNL-16422.pdf) DHS listed CWOP as a national asset to the 'BioWatch' Network, stating that data from personal weather stations could be useful in weather forecasts for hazardous releases. In 2007, the FindU server received 422,262,687 weather reports, which is a 29.5% increase over 2006." [http://science.slashdot.org/article.pl?sid=08/01/19/1835237]
No longer do home forecasting gadgets look like hospital equipment thanks to Oregon Scientific's efforts to add an aesthetic dimension to its products. Its latest offering looks more like a retro sci-fi movie prop than something used to guess whether you should bring a sweater to the picnic. [http://crave.cnet.com/8301-1_105-9842340-1.html]

Pothole Sensors in Every Car
www.boston.com/news/local/massachusetts/articles/2011/02/09/weapons_in_the_battle_vs_potholes/
9Feb2011: The City of Boston is releasing an app called Street Bump that uses the accelerometer in your smartphone to automatically report bumps in the road as you drive over them. The application relies on two components embedded in iPhones, Android phones, and many other mobile devices: the accelerometer and the Global Positioning System receiver. The accelerometer, which determines the direction and acceleration of a phone’s movement, can be harnessed to identify when a phone resting on a dashboard or in a cupholder in a moving car has hit a bump; the GPS receiver can determine by satellite just where that bump is located. “We’re constantly looking for new ways to make sure that roads are as smooth as they possibly can be, and we believe that Street Bump is a first-in-the-nation app,” said Chris Osgood, one half of the two-man Urban Mechanics office.
Osgood and Nigel Jacob, New Urban Mechanics cochairman, have developed the prototype with Fabio Carrera, a professor at Worcester Polytechnic Institute, and Joshua Thorp and Stephen Guerin, developers from the Santa Fe Complex, a civic-minded technology and design think tank in New Mexico. “It’s a new kind of volunteerism,” Jacob said. “It’s not volunteering your sweat equity. It’s volunteering the devices that are in your pocket to help the city.”

IBM Tivoli