CSCU9YS-Exam-Notes

CSCU9YS Exam Notes. Read here ->


If you’re seeing ugly html here, please visit the website, as linked above. GitHub’s default README parsing doesn’t support math formulas, and this document makes use of them.

Security And Forensics


Adversaries and Attacks

aka How to become a scammer in 419 steps.

Adversaries

Attacks

Secure System Requirements

A secure system is comprised of:

Privacy

Privacy is “a state of being free from being observed or disturbed by others”.

Privacy in the society

Before the internet, it was hard to find data (especially personal data) because it was difficult to obtain. Now it is extremely easy for both individuals and organisations to carry out surveillance on someone.

The concept of privacy varies from country to country and person to person.

In the E.U. the personal data is owned by you, and companies are not allowed to sell it. Conversely, in the U.S. the company owns it, and it can be sold to other companies (for profit, marketing, etc.).

It is important to be aware of privacy issues, as our decisions might shape what happens to our personal information in the future. The E.U. seems to be on the right track for now, good job us! :)

Data Protection Act (DPA)

The Data Protection Act is designed to protect citizens’ privacy and enforce how organisations handle information relating to them.

The DPA can be summarised in eight principles:

Regulation of Investigatory Powers Act (RIPA)

In contrast with the DPA, RIPA allows government agencies to monitor citizens’ activities if they are suspected of a crime. It requires ISPs to keep browsing history, and allows said agencies to access it at any time.

General Data Protection Regulation (GDPR)

GDPR is a regulation in EU law that:

Data mining

Big software companies like Google and Facebook make most of their profit by using users’ data to target advertisements, and by selling the data to interested parties.

Anonymity

The situation in which someone’s name is not given or known.

Anonymity can be useful to guarantee the privacy of users; on the other hand, it can easily be exploited by people with malicious intent.

Authentication

Authentication is the process of identifying an individual as being genuine and not an impostor.

Common types of authentication mechanisms (identity authentication) check whether an individual:

Authenticating an individual is not particularly simple, however, as it relies on documents and on information from other people.

Authentication might affect your privacy, as it is simple to collect data from multiple sources that can be linked to a specific individual (credit cards, loyalty cards, bus cards, etc.)

Methods

Passwords

Passwords and PINs (Personal Identification Numbers) are a method of identification that requires an individual to remember something to prove that they are the associated user.

Passwords are often common words, and can be mined from breached databases. They are often reused, which causes security issues if one of the services has leaked information.

Passwords are usually stored in a file using a hash function. This prevents the actual password from being disclosed by the file.

Possible attacks include:

One-Time Passwords

The system asks the user to insert a one time password that is generated by another device (dedicated, or mobile phone). This is often used in combination with a password (2-factor authentication).

This was initially only used by online banking; now all major services offer it as an additional layer of security.
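As an illustration of how a one-time password can be generated from a shared secret, here is a minimal sketch of the counter-based HOTP scheme (RFC 4226). The notes do not say which scheme is meant, so the choice of HOTP, the function name `hotp` and the example secret are assumptions made for illustration only.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Return the one-time code for a shared secret and a moving counter."""
    # HMAC-SHA1 over the counter, encoded as 8 big-endian bytes.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    # Take 4 bytes from that offset and clear the top (sign) bit.
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep only the requested number of decimal digits.
    return str(value % 10 ** digits).zfill(digits)


# Illustrative secret only; a real one is provisioned by the service.
shared_secret = b"12345678901234567890"
for counter in range(3):
    print(counter, hotp(shared_secret, counter))
```

Both the server and the user's device hold the same secret and counter, so they derive the same code without the code ever being stored. Time-based variants replace the counter with the current time window.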

Biometrics

Biometrics are an automated method for recognising individuals based on measurable biological and behavioural characteristics.

They don’t require the user to remember anything, and are often considered safer than passwords.

Issues
Biometric authentication:

Identification vs Verification

Identification and verification are two types of authentication.

We’ll take biometric identification and verification as an example.


Verification systems seek to answer the question “Is this person who they say they are?” Under a verification system, an individual presents himself or herself as a specific person. The system checks his or her biometric against a biometric profile that already exists in the database linked to that person’s file in order to find a match.

Verification systems are generally described as a 1-to-1 matching system because the system tries to match the biometric presented by the individual against a specific biometric already on file.

Identification systems are different from verification systems because an identification system seeks to identify an unknown person, or unknown biometric.

The system tries to answer the questions “Who is this person?” and must check the biometric presented against all others already in the database. Identification systems are described as a 1-to-n matching system.
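The difference between 1-to-1 and 1-to-n matching can be sketched in a few lines. This is a toy model, not a real biometric matcher: the feature vectors, the Euclidean distance and the `threshold` value are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional

Template = List[float]


def distance(a: Template, b: Template) -> float:
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def verify(claimed_id: str, sample: Template,
           database: Dict[str, Template], threshold: float = 0.5) -> bool:
    """1-to-1: compare the sample only with the claimed person's template."""
    template = database.get(claimed_id)
    return template is not None and distance(sample, template) <= threshold


def identify(sample: Template, database: Dict[str, Template],
             threshold: float = 0.5) -> Optional[str]:
    """1-to-n: compare the sample with every template, keep the closest."""
    best_id, best_distance = None, threshold
    for person_id, template in database.items():
        d = distance(sample, template)
        if d <= best_distance:
            best_id, best_distance = person_id, d
    return best_id


# Hypothetical enrolled templates and a fresh sample.
database = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}
sample = [0.12, 0.88, 0.31]

print(verify("alice", sample, database))
print(verify("bob", sample, database))
print(identify(sample, database))
```

Verification does one comparison regardless of database size, while identification has to scan every enrolled template, which is why it is both slower and more prone to false matches as the database grows.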

Access Rights

Identifying what a user can do is as important as identifying who they are in the first place.

If a breach occurs for a standard user, the damage should be limited.

It is assumed that system administrators are much more careful in handling their authentication process.

Nowadays operating systems have an Access Control List that maps the rights that each user has towards a specific object (file).

Unix systems have a very strict and granular permission policy, with permission groups and different flags. This is because they were designed to be multi-user operating systems from the start.

Databases have similar access control lists, with users and groups. When properly designed, a database should only allow the minimum required rights to a user. For instance, a client that is only required to read data from a table, should only have rights to read that table, and nothing else.
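A minimal sketch of an access control list with least privilege might look like the following. The user names, the `orders` object and the action names are hypothetical; real operating systems and databases store and evaluate ACLs in their own ways.

```python
from typing import Dict, Set, Tuple

# Maps (user, object) to the set of actions that user may perform on it.
# Anything not listed is denied by default.
acl: Dict[Tuple[str, str], Set[str]] = {
    ("reporting_client", "orders"): {"read"},
    ("admin", "orders"): {"read", "write", "delete"},
}


def is_allowed(user: str, obj: str, action: str) -> bool:
    """Return True only if the ACL explicitly grants the action."""
    return action in acl.get((user, obj), set())


print(is_allowed("reporting_client", "orders", "read"))
print(is_allowed("reporting_client", "orders", "write"))
print(is_allowed("reporting_client", "customers", "read"))
```

Because the default is to deny, a compromised `reporting_client` account can read one table and nothing more, which limits the damage of a breach.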

Per-user virtual memory was introduced to prevent a user from accessing (peeking at or writing to) another user’s or the system’s memory.

Authorisation

Authorisation restricts what an authenticated identity can do.

This is to protect the systems and data from the users.

Integrity

Integrity is concerned with ensuring that information is genuine and has not been tampered with.

Nowadays it is extremely easy to forge and manipulate the truth with fake media. It is important to understand what is true and what is not. One way of achieving this is accountability or auditing: knowing who has done what, and when.

Audit

Auditing is the process of conducting a systematic review of something.

In computer terms, it requires logging and recording all the actions the users have done on the system.

This can be useful to keep a history of the files, but also to find trails left by users with malicious intent. An audit mechanism also acts as a deterrent.

Audit logs can still be tampered with or erased by skilled attackers.

Computer Forensics

Computer forensics is the practice of collecting, analysing and reporting on digital data in a way that is legally admissible. It can be used in the detection and prevention of crime and in any dispute where evidence is stored digitally.

Digital evidence is any piece of information, whether or not it has been subject to human intervention, that can be extracted from a computer and presented in a human-readable format, for instance in a court of law.

Computer forensics is used by:

The Incident Response Process

Preparation

During this phase an Incident Response Team (IRT) is set up, and the system is prepared to avoid, mitigate and log possible attacks.

Detection

Prepare the system so that if an intrusion happens, it is detected and an appropriate response is followed.

Data mining

Data mining can be used to automatically detect:

Intrusion Detection

Traditional malware detection is signature-based, meaning that it looks for the exact bytes (or a portion of them) of the malicious code. This makes it ineffective against newly spreading malware (until it is detected and the malware registry is updated), and against malware that purposefully alters its code to avoid detection.
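The weakness of signature matching can be shown with a toy scanner. The signature bytes and the name `demo-malware` are made up for this sketch, and the XOR "mutation" is only a stand-in for the ways real malware re-encodes itself.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern to look for.
SIGNATURES: Dict[str, bytes] = {
    "demo-malware": bytes.fromhex("deadbeef4d5a"),
}


def scan(data: bytes) -> List[str]:
    """Return the names of all signatures found verbatim in the data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


original = b"header" + bytes.fromhex("deadbeef4d5a") + b"payload"
# The same content re-encoded with a one-byte XOR key: behaviour could be
# restored at run time, but the stored bytes no longer match the signature.
mutated = bytes(b ^ 0x5A for b in original)

print(scan(original))
print(scan(mutated))
```

The scanner flags the original sample but misses the mutated one, even though both carry the same payload once decoded.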

Automated detection can have two approaches:

Buffer Overflow Detection

Machine learning can be used to detect buffer overflow vulnerabilities in both source code and compiled machine code.

Malicious detection

With analogous methods, it is possible to use machine learning to detect viral emails, by analysing various features of the email.

Initial Response

Toolkit:

Recording the circumstances of the intrusion and informing the involved parties is key.

Data Collection

Collecting data from the affected computers is fundamental to understanding the magnitude of the damage, and to seeing whether it has affected any other machine in the network.

Data Analysis

Once the data has been collected from the victim machine, it is moved to another machine to be analysed.

Media Forensics

Digital Watermarking

Digital watermarking identifies the author of a piece of media by embedding the information directly in the image, in a way that is imperceptible, robust and secure.

Applications:

Image Forensics

Kids that downloaded a pirate copy of Photoshop (and dictators alike) tend to manipulate pictures. How can an educated computer science student with too much free time spot these fakes? Let’s find out in 404 simple steps!

Cryptography

Cryptography is the study and practice of protecting information by data encoding and transformation techniques.

Encryption allows to:

Cryptography plays an important role in identity authentication (passwords, certificates).

Cryptography is also important in data integrity, as digital data can be associated with a (short) digital fingerprint that will change even if a single bit has been altered (hash/checksum).

Terminology

Symmetric and Asymmetric Encryption

Encryption methods can use one or more keys, an encryption key $k_e$ and a decryption key $k_d$, so that:

$$D_{k_d}(E_{k_e}(m)) = m$$

An algorithm where $k_e = k_d$ is defined as symmetric. In other words, the same key can be used for both encryption and decryption.

Conversely, algorithms that follow the rule $k_e \neq k_d$ are known as asymmetric. They can be more secure in certain circumstances (opening a secure channel). Asymmetric encryption is very computationally taxing.

Cryptographic Algorithms

A cipher is an algorithm for performing encryption or decryption, a series of well-defined steps that can be followed as a procedure.

// TODO Cover this in more detail!

Cryptanalysis — Cryptographic Attacks

Cryptanalysis is the process of breaking ciphers.

There are a few ways of breaking a cipher:

Any algorithm is theoretically breakable, but in practice an algorithm that would take too long to decipher is considered unbreakable.

Symmetric and asymmetric algorithms have different vulnerabilities:

Looking for patterns is a very effective way against substitution ciphers.

For a given language, the frequency of letters and words is known. This can be used to guess which character/word is which, making it possible to progressively build a decryption algorithm by trying a small set of substitutions until they make sense. One-time pads and book ciphers are particularly resistant to the exploitation of common patterns.
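Frequency analysis can be sketched on the simplest substitution cipher, a Caesar shift. The example assumes English text in which `e` is the most common letter, and the plaintext was picked so that this holds; on short or unusual texts the first guess can be wrong and further candidates have to be tried.

```python
import string
from collections import Counter


def caesar(text: str, shift: int) -> str:
    """Shift every lowercase letter by `shift` places, leave the rest alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def guess_shift(ciphertext: str) -> int:
    """Assume the most frequent ciphertext letter stands for 'e'."""
    letters = [ch for ch in ciphertext if ch in string.ascii_lowercase]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26


plaintext = "meet me here between seven and eleven near the green tree"
ciphertext = caesar(plaintext, 3)

shift = guess_shift(ciphertext)
recovered = caesar(ciphertext, -shift)
print(ciphertext)
print(shift, recovered)
```

The attacker never sees the key: the letter statistics of the language leak through the cipher and are enough to recover it.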

Creating Encryption Algorithms

Shannon’s Rules

Stream vs Block Ciphers

There are two main types of ciphers: stream ciphers and block ciphers.

Stream ciphers

Block ciphers

Confusing vs Diffusing

There is also a distinction between confusing vs diffusing ciphers.

Confusing ciphers

Diffusing ciphers

Comparison Matrix

| | Stream | Block |
|---|---|---|
| Advantages | Speed, low error propagation | Strong diffusion |
| Disadvantages | Weak/no diffusion | Slow, vulnerable to error propagation |

Using Encryption

Commercial Encryption

Commercial-grade encryption should:

Currently used commercial-grade encryption algorithms:

Data Encryption Standard (DES)

DES is a symmetric key cipher with a 56 bit private key.

Method:

Advanced Encryption Standard (AES)

AES is a symmetric key cipher with a variable-length key (128, 192 or 256 bits).

Advantages:

Method:

Rivest-Shamir-Adleman (RSA)

RSA is an asymmetric cipher, using a set of public and private keys.

In RSA:

Hash Functions

A hash function is a one-way function that can be used to map data of arbitrary size to data of a fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.

Hash functions are used in data integrity, as even a single bit of difference in the input data returns a totally different hash value. So it is easy to spot if the data has been tampered with.

Hash functions are also used in authentication, as they are generally used to store passwords in databases, so that they cannot be reversed.
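The integrity property can be observed with Python's standard `hashlib` module. SHA-256 is used here only as an example of a hash function, and the two messages are invented; they differ in exactly one bit (the characters `1` and `9`).

```python
import hashlib

original = b"Transfer 100 GBP to Alice"
tampered = b"Transfer 900 GBP to Alice"  # '1' -> '9' flips a single bit

digest_original = hashlib.sha256(original).hexdigest()
digest_tampered = hashlib.sha256(tampered).hexdigest()

# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(int(digest_original, 16) ^ int(digest_tampered, 16)).count("1")

print(digest_original)
print(digest_tampered)
print("differing bits:", differing_bits)
```

The digests have a fixed length whatever the input size, and hashing the same input again always gives the same digest, which is what makes the comparison meaningful.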

Common hashing functions are:

Key Exchange

Asymmetric keys allow a receiver to ask a sender to encrypt a message so that only the receiver is able to decode it, even though the receiver has published the encryption key.

HTTPS, being based on TLS/SSL, uses key exchange and certificates to create a shared key, to create a secure channel that is then encrypted with a symmetric algorithm.

Diffie-Hellman

Diffie-Hellman is a key exchange algorithm and allows two parties to establish, over an insecure communications channel, a shared secret key that only the two parties know, even without having shared anything beforehand.

Note: Although it uses the same principles of private and public key, Diffie-Hellman is not an asymmetrical encryption algorithm, as no encryption happens. On the other hand, it’s a secure method of creating a shared private key, so that the communication can be encrypted using a symmetric encryption algorithm. It is, however, an essential building-block, and was in fact the base upon which asymmetric crypto was later built.

Key Exchange

Alice and Bob exchange a large prime modulus $p$ and a generator $g$ (both represented as the common paint).

Alice and Bob separately decide two private numbers, $a$ and $b$ respectively (secret colours). Then they compute the two public keys as follows:

A: $A = g^a \bmod p$

B: $B = g^b \bmod p$

And publicly exchange $A$ and $B$.

Then they both execute:

A: $s = B^a \bmod p$

B: $s = A^b \bmod p$

And here’s the magic:

$$B^a \bmod p = (g^b)^a \bmod p = g^{ab} \bmod p = (g^a)^b \bmod p = A^b \bmod p$$

Both Alice and Bob have generated the same $s$, without it ever being sent over an insecure channel. At this point, Alice and Bob can use $s$ to communicate on a channel encrypted with symmetric algorithms.

In order for Eve, an eavesdropper, to figure out $s$, she would have to solve $A = g^a \bmod p$ for $a$ (or $B = g^b \bmod p$ for $b$), which is (in principle) computationally hard.
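The exchange can be tried out with toy numbers. In this sketch `p` is the public prime modulus, `g` the public generator, and the helper names `dh_public` and `dh_shared` are invented for illustration; `p = 23` is far too small for real use, where the modulus has thousands of bits.

```python
import secrets


def dh_public(g: int, private: int, p: int) -> int:
    """Public key: g raised to the private number, modulo p."""
    return pow(g, private, p)


def dh_shared(other_public: int, private: int, p: int) -> int:
    """Shared secret: the other party's public key raised to our private number."""
    return pow(other_public, private, p)


p, g = 23, 5  # toy public parameters

# Each side picks a private number in the range 1 .. p-2 and never sends it.
a = secrets.randbelow(p - 2) + 1  # Alice
b = secrets.randbelow(p - 2) + 1  # Bob

A = dh_public(g, a, p)  # sent to Bob in the clear
B = dh_public(g, b, p)  # sent to Alice in the clear

s_alice = dh_shared(B, a, p)
s_bob = dh_shared(A, b, p)
print(s_alice == s_bob)
```

Eve sees `p`, `g`, `A` and `B`, but not `a` or `b`. With a modulus this small she could simply try every exponent, which is exactly why real parameters are enormous.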

Okay, I think I actually understood this, my head is about to explode, bye.

Digital Signatures

Digital signatures guarantee that a particular identity (person/organisation) sent a message.

Digital signatures need to be:

Asymmetric Digital Signatures rely on the fact that algorithms such as RSA are commutative:

$$D(E(m)) = E(D(m)) = m$$

This is probably wrong in the slides.

Meaning that a private key of a set can be used to:

  1. Decrypt a message only intended for the recipient, which may be encrypted by anyone having the public key (asymmetric encrypted transport).
  2. Encrypt a message which may be decrypted by anyone, but which can only be encrypted by one person (signature).

The process:

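Alice sends the message $m$ together with a cipher text version $c$ of it, such that $c = E_{k_{priv}}(m)$, where $k_{priv}$ is Alice’s private key.

Bob performs the operation $m' = D_{k_{pub}}(c)$ with Alice’s public key $k_{pub}$, which is equivalent to $D_{k_{pub}}(E_{k_{priv}}(m))$.

If $m' = m$, the message is guaranteed to be from Alice. If $m' \neq m$, the message was not from Alice, or it was tampered with.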
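The signing process can be sketched with textbook RSA on tiny numbers. Everything here is illustrative: the primes 61 and 53, the exponent 17 and the integer message 65 are toy values, and real signature schemes sign a padded hash of the message rather than the message itself.

```python
# Toy key generation (requires Python 3.8+ for the modular inverse via pow).
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent, the inverse of e modulo phi


def sign(message: int) -> int:
    """Alice applies her private key to the message."""
    return pow(message, d, n)


def verify(message: int, signature: int) -> bool:
    """Bob applies Alice's public key and compares with the message."""
    return pow(signature, e, n) == message


message = 65               # must be smaller than n in this toy setting
signature = sign(message)

print(verify(message, signature))
print(verify(message, (signature + 1) % n))  # forged signature
print(verify(message + 1, signature))        # tampered message
```

Only the holder of `d` can produce a value that the public pair `(e, n)` maps back to the message, so a successful check ties the message to Alice's key.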

Digital Certificates

A digital certificate, also known as a public key certificate or identity certificate, is an electronic document used to prove the ownership of a public key. The certificate includes information about the key, information about the identity of its owner (called the subject), and the digital signature of an entity that has verified the certificate’s contents.

If the signature is valid, and the software examining the certificate trusts the issuer (CA), then it can use that key to communicate securely with the certificate’s subject.

Digital Certification Process

Steps:

  1. The developer:
    1. Generates their own public/private key pair
    2. Creates a Certificate Signing Request (CSR) containing information about the identity
    3. Sends CSR to Certificate Authority (CA)
  2. Certificate Authority:
    1. Checks Integrity of CSR
    2. Checks authenticity of CSR ID
    3. CA Creates a certificate containing identity and signed via CA private key
    4. The certificate public key is available, allowing the CA signature to be checked via decryption
  3. The developer:
    1. Publishes an application
    2. Signs it with the private key and provides CA signed certificate as verification
    3. The developer’s public key verifies the app, the CA’s public key decrypts the certificate and indicates that the developer’s identity is valid.

Trust

Digital certificates make it possible to establish a trusted connection between two parties that have never met. This is a centralised approach, because it all depends on the CAs being valid and trustworthy.

For both political and security reasons, it could be argued that the certificate authority shouldn’t be centralised, but rather distributed between all the peers on the network. This leads to the creation of blockchain, where trust is given by the majority of the peers accepting something as true.

Securing Software

Managing Passwords

When designing a system, it is best to decide not to store passwords as plaintext, as they would be vulnerable in case of a breach.

Normally, passwords are stored as salted hashes, since reversible encryption is not required. A salt is a random sequence of bytes, stored alongside the username and added to the password before hashing (hence the name), so that the stored value $hash(password + salt)$ is randomised. This prevents an attacker from precomputing a plaintext -> hash table (rainbow table) that would be easy to search. It also means that even if two users have the same password, their salted hashes will be different.

When the user enters a password to attempt a login, the salt is fetched using the username.

If $hash(entered\ password + salt)$ matches the stored hash, the user is authenticated.
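A minimal sketch of salted password storage, using only the Python standard library. The notes only ask for a salted hash; the use of PBKDF2 (a deliberately slow, iterated hash), the iteration count, the in-memory `users` dictionary and the function names are all assumptions made for this example.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

ITERATIONS = 100_000  # illustrative work factor


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive the stored value from the password and the per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(users: Dict[str, Tuple[bytes, bytes]], username: str, password: str) -> None:
    """Generate a fresh random salt and store (salt, hash) for the user."""
    salt = os.urandom(16)
    users[username] = (salt, hash_password(password, salt))


def login(users: Dict[str, Tuple[bytes, bytes]], username: str, password: str) -> bool:
    """Fetch the user's salt, re-hash the attempt and compare."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    # Constant-time comparison avoids leaking where the first mismatch is.
    return hmac.compare_digest(hash_password(password, salt), stored_hash)


users: Dict[str, Tuple[bytes, bytes]] = {}
register(users, "alice", "correct horse")
register(users, "bob", "correct horse")   # same password, different salt

print(users["alice"][1] != users["bob"][1])
print(login(users, "alice", "correct horse"))
print(login(users, "alice", "wrong guess"))
```

The plaintext password is never stored, and because each user has their own salt, a precomputed table would have to be rebuilt for every single account.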

Estimating Program Security

Buffer Overflow

A buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer’s boundary and overwrites adjacent memory locations. This can be exploited by an attacker to inject malicious code in memory.

Buffer Overflow Attack

Programming in low-level languages like C/C++ (C, especially) makes it more likely to introduce buffer overflow vulnerabilities. Java/C# validate the buffer size at run time (making them slower), but are relatively safe from that point of view.

Incompatible or Unexpected Data

If a conversion between two data types is possible (e.g. a 32-bit float to a 32-bit integer) but not expected, it might result in an overflow (the number rolling over). This can put the system in an inconsistent state (imagine it’s a bank balance!) and/or crash it entirely. It might even be possible to inject malicious code.
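Python integers do not overflow, so this sketch uses `ctypes.c_int32` to imitate the fixed-width integer of a lower-level language. The function names and the bank-balance framing are illustrative assumptions.

```python
import ctypes

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def unchecked_to_int32(value: float) -> int:
    """Store the value in 32 bits with no range check: out-of-range values wrap."""
    return ctypes.c_int32(int(value)).value


def checked_to_int32(value: float) -> int:
    """Refuse values that do not fit, instead of silently wrapping."""
    result = int(value)
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit integer")
    return result


balance = 2147483648.0  # one more than the largest 32-bit signed integer

print(unchecked_to_int32(balance))
try:
    checked_to_int32(balance)
except OverflowError as error:
    print("rejected:", error)
```

The unchecked conversion turns a large positive balance into a large negative one, while the checked version fails loudly so the caller can handle the problem.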

Escaped Sequences

Websites that use scripting languages are vulnerable to escaped strings sent by the user that might get executed.

SQL Injection

It relies on escaping the intended use of a form field, so that an SQL query could be constructed and injected.

xkcd on SQL Injection

A typical attack consists in exploiting this vulnerability and gaining administrator rights to the website.


In order to prevent this (and similar attacks), it is important to properly validate and sanitise input from users.
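The attack and one common defence can be reproduced with Python's built-in `sqlite3` module. The table layout, the credentials and the function names are invented for this sketch, and the password is kept in plaintext only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def login_vulnerable(name: str, password: str) -> list:
    """Builds the query by pasting user input into the SQL text."""
    query = (
        "SELECT name FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchall()


def login_safe(name: str, password: str) -> list:
    """Uses placeholders, so input is always treated as data, never as SQL."""
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()


payload = "' OR '1'='1"  # closes the string and adds an always-true condition

print(login_vulnerable("admin", payload))
print(login_safe("admin", payload))
print(login_safe("admin", "s3cret"))
```

In the vulnerable version the payload rewrites the `WHERE` clause so that it matches every row, logging the attacker in without the password. With placeholders the same payload is just an incorrect password.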

Rogue Software

Types of malware

Malicious software can be classified according to the way they propagate:

Or by the way they are activated:

Finally, they can also be classified by what they do:

Structure of a malware

A malware consists of three elements:

Computer Viruses

Not all malware are viruses. A virus is a specific type of malware that spreads by attaching itself to a program, so that when that program is executed by the user, the virus is executed as well.

Trojans

A Trojan horse is any malicious computer program which misleads users about its true intent. Trojans usually involve a legitimate or fake program with embedded malicious code. Their payload usually involves a backdoor that gives remote access to the attacker.

Worms

Worms are stand-alone computer programs that spread over the network autonomously, usually exploiting vulnerabilities.

Detection of rogue programs

Rogue programs are usually detected by:

Securing Network

Three threats exist when trying to access a networked service:

Network Attacks

Network Defences

Kerberos

Kerberos (/ˈkɜːrbərɒs/) is a computer network authentication protocol that works on the basis of tickets to allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner. Kerberos builds on symmetric key cryptography and requires a trusted third party, and optionally may use public-key cryptography during certain phases of authentication.

Kerberos provides a centralised authentication server to authenticate users to a service.

Once the user is signed in on Kerberos, all the allowed environments secured by Kerberos are accessible to the user.

Firewalls make the (wrong) assumption that all attacks come from the outside. With Kerberos, the assumption is that each connection between machines is a weak link in the security.

Kerberos Network

A client asks the authentication server (AS) for a ticket-granting ticket (TGT) with an authentication request. The AS provides the client with the TGT (if everything was okay), so the client has a temporary session in which it is logged in to the network. The next time the client needs a ticket, it presents the TGT directly to the ticket-granting server instead of authenticating again, and then calls the required service.

Kerberos design principles: