3-minute fever 三分鐘熱度: August 2020

Monday, August 31, 2020

MIME - Multipurpose Internet Mail Extensions

Today marks the last day of my 31-day challenge to explain IT-related terms. I didn't have an actual plan for this, so I picked my terms randomly from bookmarked URLs, books, or whatever I heard or read.

DKIM was one of the terms I found in a raw email message. Today's word, MIME, is the second term I chose from that raw email message. :P

MIME, or Multipurpose Internet Mail Extensions, is an Internet standard that extends the email format so that a message can contain text in character sets other than ASCII, carry attachments such as audio, images, video or any other non-text file, or be made up of multiple parts.

I thought the MIME header was the part that defines the content of the email, but actually it is not. Normally an email just states the MIME version in the MIME-Version header, and the format of the message is defined in the Content-Type header.

I inspected a few newsletters I received in my mailbox, and it seems a single email can carry both plain text and HTML content! There's a "divider" in the email message called the boundary, which separates the plain text content from the HTML content. This makes me wonder: did the email marketing platform that I worked with support this? I remember the email format sent was based on a flag for plain text or HTML, but not both?
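
To see the boundary for myself, here is a minimal sketch using Python's standard email library. The addresses and the message text are made up; the point is only that one message ends up with a text/plain part and a text/html part under a multipart/alternative Content-Type.

```python
from email.message import EmailMessage

# Addresses and texts are made up for illustration.
msg = EmailMessage()
msg["From"] = "newsletter@example.com"
msg["To"] = "reader@example.com"
msg["Subject"] = "Hello"

msg.set_content("Hello in plain text.")  # becomes the text/plain part
msg.add_alternative("<p>Hello in <b>HTML</b>.</p>", subtype="html")  # adds the text/html part

raw = msg.as_string()  # serializing the message generates the boundary string

print(msg["MIME-Version"])     # the MIME version header
print(msg.get_content_type())  # the Content-Type of the whole message
print(msg.get_boundary())      # the divider between the two parts
print(raw)                     # the raw message, similar to "show original" in a mail client
```

In the printed raw message, the boundary string shows up once in the Content-Type header and again before each part.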

Further reads:

Sunday, August 30, 2020

DKIM - DomainKeys Identified Mail

DKIM, or DomainKeys Identified Mail, is a security measure to check whether an email was sent by an authorized sender.

It adds a digital signature to every message sent. This allows the receiving server to verify whether the message was forged.

A sender can pretend a message is sent from abcbank.com, for example. You might see that the sender's email is admin@abcbank.com, but it was actually sent from a malicious source. Checking the DKIM signature will reveal whether the email was indeed sent by a server that the domain's owner authorized.

Two other terms related to DKIM are SPF (Sender Policy Framework) and DMARC (Domain-based Message Authentication, Reporting, and Conformance).
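
As a rough sketch of what the receiving server looks at first: the DKIM-Signature header is a list of tag=value pairs, and the d= (domain) and s= (selector) tags tell the receiver where in DNS the public key is published. The header value below is made up, and the bh= and b= values are placeholders, not real hashes or signatures. The actual signature check needs a crypto library and is not shown.

```python
# A made-up DKIM-Signature header value. The bh= and b= values are
# placeholders, not a real body hash or signature.
EXAMPLE_HEADER = (
    "v=1; a=rsa-sha256; d=abcbank.com; s=mail2020; "
    "h=from:to:subject; bh=PLACEHOLDER_BODY_HASH; b=PLACEHOLDER_SIGNATURE"
)


def parse_dkim_tags(header_value: str) -> dict[str, str]:
    """Split a DKIM-Signature value into its tag=value pairs."""
    tags = {}
    for item in header_value.split(";"):
        if "=" not in item:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        name, value = item.split("=", 1)
        tags[name.strip()] = "".join(value.split())  # drop any folding whitespace
    return tags


def public_key_dns_name(tags: dict[str, str]) -> str:
    """DNS name of the TXT record where the signer publishes its public key."""
    return f"{tags['s']}._domainkey.{tags['d']}"


tags = parse_dkim_tags(EXAMPLE_HEADER)
print(tags["d"])                  # the signing domain
print(public_key_dns_name(tags))  # where the receiver looks up the public key
```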

Further reads:

Saturday, August 29, 2020

WebSockets

WebSockets is a protocol that sits at the application layer of the OSI (Open Systems Interconnection) model and relies on TCP (Transmission Control Protocol) to connect and send data. It provides full-duplex communication, i.e. it enables data to be transferred in both directions at the same time.

It is often compared to HTTP/2 (Hypertext Transfer Protocol version 2); however, it is not meant to replace HTTP. HTTP and WebSockets each have their own pros and cons, so developers should choose carefully which protocol to use for optimal system performance.

Anyway, to start a WebSockets connection, the client always asks the server whether it supports WebSockets. This request is sent as an HTTP request with an Upgrade header. Once the server confirms by sending an HTTP response (101 Switching Protocols), the client and server start to communicate using WebSockets.
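
One small piece of that handshake can be sketched in a few lines. In the HTTP request, the client sends a Sec-WebSocket-Key header, and the server proves it understands WebSockets by replying with a Sec-WebSocket-Accept header derived from that key and a fixed GUID defined in RFC 6455. This sketch only computes that value; it does not open any connection.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the example in RFC 6455.
print(accept_value("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```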

Further reads:

Friday, August 28, 2020

Broadband

I asked my sister, any IT jargon that you'd like to know? She answered, broadband.

I am like... What?!

She challenged back, "See, you can't explain!"

"Broadband is high speed Internet."

"So, broadband is Internet?"

"It's high speed Internet."

Then I tried to mimic the Jaring dial-up sound. :D

I googled a bit, and found this article https://www.broadbandgenie.co.uk/broadband/help/beginners-guide-to-broadband. The dial-up connection is called "narrowband".

Thursday, August 27, 2020

Webhook

My first encounter with this term was with GitHub, when I "reactivated" my GitHub account early this year.

After reading a few articles, I am not sure if I really get what a webhook is, but here's my understanding.

A webhook is something like a callback HTTP API. It is triggered after a defined event happens.

In GitHub, it can be set on a repository. The target URL (called the payload URL) can be set up on another server, as long as it is reachable via an HTTP POST request. You can select which events should trigger the webhook. In GitHub, the default setting is that the webhook is called whenever a push happens. The target URL could be an API that informs a list of recipients about the push activity.

Because of this "event-triggered" attribute, and because it is so often related to notifications, some define a webhook as an automated notification. However, a webhook can do more than that.

It could be used to send data to another system, say Kafka, for big data collection, analysis and integration; it could be used in CI/CD to trigger the auto-build and auto-test process whenever there's a code push; or it could simply be a notification of an event such as an account login or a money transfer.
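
Here is a minimal sketch of what the receiving side of a webhook might do. GitHub can sign each delivery with a shared secret and send the result in the X-Hub-Signature-256 header as "sha256=" followed by an HMAC hex digest, so the receiver can check that the POST really came from GitHub. The payload fields (repository, pusher) and the function names are simplified and hypothetical, not GitHub's real payload schema, and the HTTP server part is left out.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    """Build the signature header value the sender would attach to the body."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_webhook(secret: bytes, body: bytes, signature_header: str) -> str:
    """Verify the signature, then react to the event (here: just describe it)."""
    if not hmac.compare_digest(sign(secret, body), signature_header):
        return "rejected"
    event = json.loads(body)
    # Hypothetical payload fields, for illustration only.
    return f"push to {event['repository']} by {event['pusher']}"


secret = b"my-webhook-secret"
body = json.dumps({"repository": "demo-repo", "pusher": "dave"}).encode("utf-8")
print(handle_webhook(secret, body, sign(secret, body)))
print(handle_webhook(secret, body, "sha256=" + "0" * 64))
```

In a real setup, the last step would be whatever the webhook is for: notify people, start a build, or forward the data to another system.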

Further reads:

Wednesday, August 26, 2020

CIA - confidentiality, integrity, availability

The CIA triad (confidentiality, integrity, availability) is the set of three fundamental principles in security.

Confidentiality ensures the data or information is accessible by authorised personnel only.

Integrity ensures the data is not tampered with or destroyed by unauthorised activity.

Availability ensures the data or services are accessible by authorised personnel at any time.

When a security incident happens, you can be sure one or more of these principles has been violated.

Further reads:

Tuesday, August 25, 2020

Computer Virus

Today's term is about viruses. Not Covid-19, but the computer virus.

A computer virus shares the "virus" name because it shares common characteristics with a biological virus: both replicate themselves and spread from host to host.

A computer virus is a self-replicating program that produces copies of itself by modifying other computer programs, the boot sector or documents. It is generally transmitted via file downloads, infected disks or flash drives, email attachments, or within an infected network.

The common-sense countermeasures are: download files only from trusted sources, use external disks or flash drives only from trusted sources, do not open emails, links or attachments from unknown senders, and install anti-virus software and keep it up to date.

Further reads:

Monday, August 24, 2020

Enumeration

Enumeration, in computer security, is a phase or process that creates active connections to a system and performs direct queries to gain more information about the target.

There are quite a number of techniques to perform enumeration. For example: some sites use WordPress, and the default admin credentials could be a way to gain more information if they are still intact; brute-force techniques can find valid user names; tools like SuperScan can detect open ports on a target computer; and so on.

With this further information, a hacker can proceed to plan the system hacking. This is one of the steps in the pre-attack phase.

Further reads:

Sunday, August 23, 2020

Exploit

An exploit, in computer security, is a technique or program that leverages a vulnerability in a computer or software to make the system not work as expected. It could cause the system to output or display something unexpected, and in the worst case it can cause a data breach or bring the system down.

Metasploit is one of the top tools for pen-testing. It has a framework that contains a lot of exploits, and you can write your own exploits too.

Further reads:

I actually planned FQDN for today. But... maybe when I don't have any term in mind next time. :P

Saturday, August 22, 2020

DNS - Domain Name System

DNS, or Domain Name System, is like a phonebook for the internet. It translates human-readable hostnames into IP addresses. This process is called DNS name resolution.

For your internet connection, you'll normally use the DNS server provided by your ISP (Internet Service Provider), or you can specify the DNS server you want. Go to your network connection settings; if you don't see DNS settings there, try looking in the advanced settings. I remember that once the default DNS was unable to serve my requests. I Googled and found a workaround: use Google Public DNS. If you are interested in changing it, try 8.8.8.8 as the DNS server.

When a request is made, the DNS resolver sends it to a DNS root nameserver, which responds with the TLD (top-level domain, such as .com or .net) nameserver for the request. Say the request is for kdb.jcrys26.com. The resolver is referred to the .com TLD nameserver. The TLD nameserver then responds with the address of jcrys26.com's nameserver. A query is then sent to that domain nameserver, which returns the target IP address. Maybe a diagram would illustrate this better, but I don't feel like drawing one today.

If I access the IP directly instead of the URL, will I be able to skip the DNS servers? I tried kdb.jcrys26.com with the nslookup command and got this.
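
In place of a diagram, here is a toy model of that chain in Python. Nothing here touches the network: the three dictionaries stand in for the root, TLD and domain nameservers, and all the server names and the IP address are made up (203.0.113.10 comes from a range reserved for documentation). It also ignores caching and CNAME records.

```python
# Toy stand-ins for the three kinds of nameserver. All names are made up.
ROOT_SERVER = {"com": "com-tld-nameserver"}
TLD_SERVERS = {"com-tld-nameserver": {"jcrys26.com": "jcrys26-nameserver"}}
DOMAIN_SERVERS = {"jcrys26-nameserver": {"kdb.jcrys26.com": "203.0.113.10"}}


def resolve(hostname: str) -> tuple[str, list[str]]:
    """Walk root -> TLD -> domain nameserver and return (IP, servers asked)."""
    labels = hostname.rstrip(".").split(".")
    tld = labels[-1]                 # e.g. "com"
    domain = ".".join(labels[-2:])   # e.g. "jcrys26.com"

    asked = ["root nameserver"]
    tld_server = ROOT_SERVER[tld]    # the root refers us to the TLD nameserver
    asked.append(tld_server)
    domain_server = TLD_SERVERS[tld_server][domain]  # the TLD refers us to the domain's nameserver
    asked.append(domain_server)
    ip_address = DOMAIN_SERVERS[domain_server][hostname]  # the final answer
    return ip_address, asked


ip, asked = resolve("kdb.jcrys26.com")
print(" -> ".join(asked), "->", ip)
```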

$ nslookup kdb.jcrys26.com
Server:		8.8.8.8
Address:	8.8.8.8#53

Non-authoritative answer:
kdb.jcrys26.com	canonical name = ghs.google.com.
Name:	ghs.google.com
Address: 216.58.196.51

However, http://216.58.196.51 doesn't give the same result as http://kdb.jcrys26.com. But it does give the same result as the canonical name (http://ghs.google.com). I guess this is because of how Blogger is set up to serve custom domains. It should be a common IP that serves all Blogger redirects. I remember there are some other parameters I needed to set for this "redirect". :P Check out the further reads section for how a browser parses an IP address.

Speaking of DNS, I can now hardly remember how we did the DNS handling and IP warming in our MTA (Mail Transfer Agent) in order to set up sending our clients' marketing emails to subscribers. Probably I have it somewhere in my work logs. :P Anyway, I have attached the nslookup examples from The Geek Stuff below for further reading as well.

Further reads:

Friday, August 21, 2020

Database Sharding

The first time I saw the word sharding was when I first got in touch with MongoDB... was it more than 5 years ago?

Anyway, back to the main topic.

Database sharding is a way of horizontally partitioning a database across several machines or nodes. Each piece of data is stored on one of the nodes based on the shard key distribution.

If the database is very big, partitioning it this way can improve performance. Each machine/node has its own resources and read/write process. However, the sharding design, or the shard key distribution, is vital to getting this performance improvement. If it is not designed carefully, it could lead to poor performance instead.
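
A minimal sketch of the idea, assuming a hash-based shard key (one of several possible strategies, and not how MongoDB specifically does it). The class name and the plain dictionaries standing in for nodes are made up for illustration.

```python
import hashlib


class ShardedStore:
    """Toy store that spreads records over several in-memory 'nodes'."""

    def __init__(self, shard_count: int) -> None:
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, shard_key: str) -> int:
        """Pick a shard by hashing the shard key, so the same key always lands on the same node."""
        digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, shard_key: str, record: dict) -> None:
        self.shards[self.shard_for(shard_key)][shard_key] = record

    def get(self, shard_key: str):
        return self.shards[self.shard_for(shard_key)].get(shard_key)


store = ShardedStore(shard_count=4)
for n in range(1000):
    store.put(f"user-{n}", {"id": n})

print([len(shard) for shard in store.shards])  # how evenly the keys were spread
print(store.get("user-42"))
```

If the shard key were something skewed, say the country of users who mostly come from one country, most records would pile up on one node, which is the poor-design case mentioned above.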

Further readings:

Thursday, August 20, 2020

ABAC - Attribute-Based Access Control

ABAC, or Attribute-Based Access Control, is another type of access control model. Instead of creating roles for access management and assigning users to the appropriate role as in RBAC, ABAC defines policies based on the attributes of the user, object, environment or even function/action in the system to manage access control.

Due to this complexity, it is more difficult to implement than RBAC. However, once the policy and attribute framework is defined, access management is easy and can be controlled at a more granular level.
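
To make "policies based on attributes" a bit more concrete, here is a small sketch. The attributes (department, clearance, sensitivity, hour of day) and the policy itself are made up for illustration; a real ABAC system would load its policies from a policy engine rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    user_department: str       # user attribute
    user_clearance: int        # user attribute
    resource_department: str   # object attribute
    resource_sensitivity: int  # object attribute
    action: str                # action attribute
    hour: int                  # environment attribute (0-23)


def is_allowed(req: AccessRequest) -> bool:
    """Hypothetical policy: same department and enough clearance; writes only in office hours."""
    same_department = req.user_department == req.resource_department
    cleared = req.user_clearance >= req.resource_sensitivity
    office_hours = 9 <= req.hour < 18

    if req.action == "read":
        return same_department and cleared
    if req.action == "write":
        return same_department and cleared and office_hours
    return False  # anything not covered by a policy is denied


print(is_allowed(AccessRequest("finance", 3, "finance", 2, "read", 22)))   # True
print(is_allowed(AccessRequest("finance", 3, "finance", 2, "write", 22)))  # False
```

The same user gets a different answer depending on the action and the time of day, which is something a plain role assignment cannot express.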

There are access models combining RBAC and ABAC. Check out the article by Ekran System below.

Further reads:

Wednesday, August 19, 2020

RBAC - Role-Based Access Control

RBAC, or Role-Based Access Control, is a type of access control model. This model makes access control easier to implement. Groups or roles are created and the access is defined for each group/role. Users are then assigned to the appropriate role, so the proper access is granted to each user based on the role assigned.

Instead of having to define or assign each access right to each individual user, this approach greatly reduces the effort to manage access control.
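
A minimal sketch of the model, with made-up roles, users and permission names: access is defined once per role, and users only get a role.

```python
# Access is defined once per role. Role and permission names are made up.
ROLE_PERMISSIONS = {
    "viewer": {"report:read"},
    "editor": {"report:read", "report:write"},
    "admin": {"report:read", "report:write", "user:manage"},
}

# Users are only assigned roles, never individual permissions.
USER_ROLES = {
    "alice": {"admin"},
    "bob": {"viewer"},
}


def can(user: str, permission: str) -> bool:
    """A user has a permission if any of their roles grants it."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(can("alice", "user:manage"))  # True
print(can("bob", "report:write"))   # False
```

Promoting bob to editor is a one-line change in USER_ROLES, with no need to touch the individual permissions.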

A lot of RBAC related articles are paid content. sien.

A 4D-Role Based Access Control Model for Multitenancy Cloud Platform is just too... mathematical...

Further reads:

Tuesday, August 18, 2020

RASP - Runtime application self-protection

RASP, or Runtime Application Self-Protection, is a security measure implemented in an application running in the Production environment. It captures requests and handles the validation within the application. It can raise an alert and prevent an attack by terminating the request.

The 2 security measures closest to RASP that were mentioned earlier in this 31-day terminology series are IAST and WAF.

RASP is different from IAST: IAST focuses on identifying vulnerabilities, while RASP focuses on protecting against cyber security attacks. IAST normally runs in a Test environment, while RASP runs in the Production environment.

RASP is also different from WAF: a WAF filters requests and responses as a proxy without knowing the application, while RASP sits inside the application and "understands" it.

This is my first time hearing this term. I went through a RASP tool list by G2 and had never heard of any of them, except Contrast, one of whose tools OWASP recommended for IAST. :P

Further readings:

Monday, August 17, 2020

IAST - Interactive Application Security Testing

IAST, or Interactive Application Security Testing, can be seen as a third testing methodology that complements SAST and DAST. It is like an agent working inside the running application to perform the security testing.

SAST does code analysis while the application is not running, while DAST performs HTTP scanning while the application is running. IAST performs code analysis, accompanied by automated or manual testing, to assess the application performance and detect vulnerabilities at run time. It can also assess the control flow and data flow, and can easily be integrated into CI/CD pipelines.

According to the OWASP website, Contrast Community Edition is the only free IAST tool available currently.

Though Synopsys gives a very good article on IAST, the link on that page for its IAST solution happened to point to its SAST solution. :P

Further read:

Sunday, August 16, 2020

DAST - Dynamic Application Security Testing

As opposed to SAST, Dynamic Application Security Testing, or DAST, is black-box testing. It is performed while the application is running. It is normally run using a tool that scans and attacks the web application.

OWASP Zed Attack Proxy (ZAP) is one of the most popular free DAST tools. You can enter your home URL into the tool and let it scan and attack your web application. You can also provide parameters or authentication credentials so that the tool can go further in detecting vulnerabilities. It also lets manual user interaction complement the DAST testing: it will capture whether a web page the user accessed contains any vulnerabilities. The report provides the CWE (Common Weakness Enumeration) ID, description, solution and reference. It is handy because you can assess the reported vulnerabilities and take the necessary steps.

Further reads:

https://www.zaproxy.org/zap-in-ten/

Saturday, August 15, 2020

SAST - Static Application Security Testing

Static Application Security Testing, or SAST, is considered white-box testing, where the tester has knowledge of and access to the underlying design and source code. The testing is performed by analyzing the source code without deploying or running the application.

A lot of the time, this testing is automated by using a tool to scan and analyze the source code. SAST tools use a set of rules to identify known or potential security flaws in the source code.

Performing SAST during development finds vulnerabilities earlier, so they can be fixed earlier and more easily.
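
To get a feel for "a set of rules applied to source code without running it", here is a tiny sketch using Python's ast module. The single rule (flag calls to eval and exec) and the sample source are made up for illustration; real SAST tools have far more rules and also track how data flows through the code.

```python
import ast

# One very simple rule: these built-in calls are risky with untrusted input.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Parse the source (without executing it) and report (line, function name) findings."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


SAMPLE_SOURCE = "x = input()\nresult = eval(x)\nprint(result)\n"
print(find_risky_calls(SAMPLE_SOURCE))  # [(2, 'eval')]
```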

Further readings:

Friday, August 14, 2020

WAF - Web Application Firewall

WAF, or web application firewall, is a security measure to detect and filter anything malicious aimed at a web application.

It is a type of reverse proxy, acting as an intermediary that blocks malicious traffic traveling to the web application and prevents unauthorized data from leaving it.

Some articles for further read.

Thursday, August 13, 2020

Identity federation

Identity federation is a system to integrate or handle authentication and access control between multiple IdPs (Identity Providers) and SPs (Service Providers).

For example, Dave, an employee of an organization, needs to work with multiple applications provided by his own organization and by other organizations. Say one of the external applications is LinkedIn Learning. Meanwhile, LinkedIn Learning also provides its service to multiple organizations.

Dave can log in to LinkedIn Learning using his company email or company account. How does this work? Through identity federation.

Identity federation establishes the trust relationship between the IdP, which could be Azure AD at Dave's organization, and the SP, which is LinkedIn Learning in this case.

Wednesday, August 12, 2020

SAML - Security Assertion Markup Language

SAML is an open standard for exchanging authentication and authorization requests and responses.

I normally hear about SP (Service Provider) and IdP (Identity Provider) in SAML (at work). There is a 3rd role involved in SAML, which is the principal (the user).

The user requests a service from the SP. The SP requests an authentication assertion from the IdP, then decides the access level based on the SAML response.

SAML's main use case is to support single sign-on (SSO). The SP and IdP could be from different organizations, but they "work together" via the SAML protocol so that the user sees them as a "one-stop solution".

Some articles for further read.

Tuesday, August 11, 2020

OAuth

OAuth, or Open Authorization, is an open standard that allows end users to authorize a 3rd-party service to access their account information without exposing their account credentials.

An application that uses OAuth first requests authorization from the user; that authorization grant is then forwarded to an authorization server to get an access token. The access token is then used to access the protected resources on the resource server.
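
The first step of that flow can be sketched as building the URL the user is sent to in order to approve the access (the OAuth 2.0 authorization code flow). The endpoint, client ID, redirect URI and scope below are all made-up values; the parameter names (response_type, client_id, redirect_uri, scope, state) are the standard OAuth 2.0 ones. Exchanging the returned code for an access token is not shown.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(
    authorize_endpoint: str, client_id: str, redirect_uri: str, scope: str, state: str
) -> str:
    """Build the URL where the user approves the application's access request."""
    query = urlencode(
        {
            "response_type": "code",  # ask for an authorization code
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,  # random value echoed back, to detect forged callbacks
        }
    )
    return f"{authorize_endpoint}?{query}"


# All values below are hypothetical.
url = build_authorization_url(
    "https://auth.example.com/authorize",
    "my-client-id",
    "https://app.example.com/callback",
    "profile email",
    secrets.token_urlsafe(16),
)
print(url)
print(parse_qs(urlparse(url).query)["scope"])  # ['profile email']
```

Note that the user's password never appears anywhere in this request; the user types it only on the authorization server's own page.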

Some articles for further read.

https://auth0.com/docs/protocols/protocol-oauth2

Reference sites:

Monday, August 10, 2020

WebAuthn

I received a newsletter from Okta a few weeks ago, and their blog post was talking about WebAuthn. Okta is a company that provides IAM (Identity and Access Management) services.

WebAuthn is a new (not really that new) W3C (World Wide Web Consortium) recommendation for web authentication using public key cryptography instead of a password.

It seems similar to HTTPS, where certificates are used for authentication, encryption and integrity between website and web client, but WebAuthn is between the web user and the website. The website holds the private key in the HTTPS case, and the user (or the user's device) holds the private key in the WebAuthn case.

I am not sure if WebAuthn will have a self-signed or CA (certificate authority) signed concept, just like in HTTPS. My main concern is, it must be free. :D

Some articles for further read.

Sunday, August 9, 2020

Single Sign-On - SSO

Single Sign-On is an authentication method that allows a user to log in once to an IdP (Identity Provider) and be authenticated to multiple applications or systems.

It is different from Directory Server Authentication, where the same IdP is used for multiple applications but the user has to key in credentials to log in to each application separately.

Some articles for further read.

Saturday, August 8, 2020

Encryption

Encryption is a process that "locks" data in plain text by encoding it, with a special key or password, into something that is not readable (ciphertext).

Only people who have the key or know the password can decode the ciphertext back into readable (plain text) form.

Friday, August 7, 2020

Redis

Redis is a NoSQL (non-relational) in-memory data store that is often used as a cache. It is designed to improve read/write performance by holding the data directly in memory.

A database that is not in-memory, such as a typical relational database, has to run a query to retrieve the data from disk and then load it into memory. This increases the data access latency and thus hurts performance.

Redis can be configured to write the data to persistent storage periodically. After a system restart, the data is then reconstructed in memory from that storage.
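
To picture the "keep it in memory, save it to disk now and then" idea, here is a toy model in plain Python. It is not Redis and does not use any Redis API; the class name, the JSON snapshot file and the method names are all made up for illustration.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy in-memory key-value store with an on-demand snapshot to disk."""

    def __init__(self, snapshot_path: str) -> None:
        self.snapshot_path = snapshot_path
        self.data = {}
        # On (re)start, rebuild the in-memory data from the last snapshot, if any.
        if os.path.exists(snapshot_path):
            with open(snapshot_path, encoding="utf-8") as snapshot:
                self.data = json.load(snapshot)

    def set(self, key: str, value) -> None:
        self.data[key] = value  # served straight from memory, no disk access

    def get(self, key: str):
        return self.data.get(key)

    def save(self) -> None:
        """Write the whole dataset to disk; a real system would do this periodically."""
        with open(self.snapshot_path, "w", encoding="utf-8") as snapshot:
            json.dump(self.data, snapshot)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "snapshot.json")
    store = TinyStore(path)
    store.set("visits", 10)
    store.save()

    restarted = TinyStore(path)     # simulates a restart
    print(restarted.get("visits"))  # 10
```

Anything written after the last save would be lost on a restart, which is the trade-off of snapshotting only periodically.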

Some articles for further read.

https://redis.io/topics/introduction

Thursday, August 6, 2020

High Availability

I often hear HA at work. There are some HA projects on-going, or done, I am not sure. I am not involved in the projects, yet. I hope. :D

For a cloud service provider, high availability design is one of the vital architecture considerations. With an HA architecture, the service keeps running at optimal performance even under high load or when one of the server nodes is down. Anyway, it is normally measured as the percentage of uptime in a year. Scheduled downtime is usually not counted in the HA measurement.

I quote this from Wikipedia: By doing this, they can claim to have phenomenally high availability, which might give the illusion of continuous availability.

There's also a numbering system for this HA measurement. One nine refers to 90%; two nines is 99%; three nines is 99.9%; four nines is 99.99%; and so on. The more nines, the higher the availability of the system.

There are a lot of design principles to ensure HA, for example redundancy, load balancing, failover mechanisms, etc.
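
To see what the nines mean in practice, here is a quick calculation of the downtime each level allows in a year. It assumes a 365-day year and counts all downtime, scheduled or not.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability_percent(nines: int) -> float:
    """One nine is 90%, two nines 99%, three nines 99.9%, and so on."""
    return 100 * (1 - 10 ** -nines)


def downtime_minutes_per_year(nines: int) -> float:
    """Maximum downtime per year that still meets the given number of nines."""
    return MINUTES_PER_YEAR * 10 ** -nines


for nines in range(1, 6):
    hours = downtime_minutes_per_year(nines) / 60
    print(f"{nines} nine(s): {availability_percent(nines):.3f}% uptime, "
          f"up to {hours:.2f} hours of downtime per year")
```

Three nines still allows about 8.76 hours of downtime a year, while five nines allows only a little over 5 minutes.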

Some articles for further read.

https://www.freecodecamp.org/news/high-availability-concepts-and-theory/

Update: Availability refers to how long your service is up and running without interruption.

Wednesday, August 5, 2020

Scalable and Elastic

I got confused with these 2 terms. To me, they seemed to be referring to the same thing, until I attended an online course on Azure.

Scalable means you can increase or decrease the resources at any time based on the demand or workload. Cloud computing supports both vertical scaling (scale up) and horizontal scaling (scale out), depending on your needs. Scaling up is adding resources to an existing server, and scaling out is adding more servers to support the additional load.

Elastic means the cloud can automatically add or remove resources based on demand.

The big difference between scalable and elastic is the magic word, automatically. :)
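
A tiny sketch of the "automatically" part, for the scale out/in case. The capacity per server and the minimum and maximum server counts are made-up numbers; a real cloud autoscaler works from metrics such as CPU usage and has cooldown rules that are not modelled here.

```python
import math


def servers_needed(
    requests_per_second: float,
    capacity_per_server: float,
    min_servers: int = 1,
    max_servers: int = 10,
) -> int:
    """Decide how many servers to run for the current load, within fixed limits."""
    wanted = math.ceil(requests_per_second / capacity_per_server)
    return max(min_servers, min(max_servers, wanted))


# The load changes over the day and the server count follows it, with nobody clicking anything.
for load in [50, 250, 900, 120, 0]:
    print(f"{load} requests/s -> {servers_needed(load, capacity_per_server=100)} server(s)")
```

With a scalable but not elastic setup, a person would look at the load and change the server count by hand.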

Tuesday, August 4, 2020

Serverless

Serverless, or serverless computing, is a platform provided by a cloud service provider that allows developers to develop or deploy a piece of code or a function without worrying about the resources, underlying infrastructure or operating system. It is currently the most granular cloud computing approach to building or running a service.

It incorporates 2 service models, Backend as a Service (BaaS) and Function as a Service (FaaS): the developer uploads the code, and the service provider manages the resources required to execute the function when it is called.

Some articles for further read.

Monday, August 3, 2020

Undercloud and Overcloud

I was thinking of separating this into 2 days. But to understand the undercloud better, I need to know what the overcloud is. :)

The undercloud is the basic setup and infrastructure required to set up the "cloud infrastructure", i.e. the overcloud, that is used by consumers. It deploys and manages the overcloud.

Red Hat OpenStack Platform (RHOSP) director is the undercloud that deploys and manages a complete overcloud infrastructure.

The overcloud is the production cloud that is used to deploy VMs and containers to run cloud workloads.

Sunday, August 2, 2020

Change data capture - CDC

Recently I attended a webinar by Red Hat, Change Data Capture with Debezium and Apache Kafka. CDC is new jargon to me. I did a quick search on this and on transaction logs. They are actually not the same thing.

CDC is an approach to capturing changes made to a data source. It records insert, update and delete activities.

The captured changes can be fed to an ETL (Extract, Transform and Load) application for data transformation, then sent to target applications, for example a telemetry dashboard, replication to a different database, or storage in an ODS (Operational Data Store) or data lake.
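
Here is a toy sketch of the idea: every insert, update and delete on the source is recorded as a change event, and a downstream consumer replays those events to keep a replica in sync. The class and the event format are made up for illustration; tools like Debezium read the changes from the database's own log instead of wrapping the writes like this.

```python
class SourceTable:
    """Toy data source that records every change as an event."""

    def __init__(self) -> None:
        self.rows = {}
        self.changes = []  # the captured change events, in order

    def insert(self, key: str, row: dict) -> None:
        self.rows[key] = dict(row)
        self.changes.append({"op": "insert", "key": key, "row": dict(row)})

    def update(self, key: str, row: dict) -> None:
        self.rows[key] = dict(row)
        self.changes.append({"op": "update", "key": key, "row": dict(row)})

    def delete(self, key: str) -> None:
        del self.rows[key]
        self.changes.append({"op": "delete", "key": key, "row": None})


def apply_changes(changes: list, target: dict) -> None:
    """Downstream consumer: replay the change events onto a replica."""
    for change in changes:
        if change["op"] == "delete":
            target.pop(change["key"], None)
        else:
            target[change["key"]] = dict(change["row"])


source = SourceTable()
source.insert("u1", {"name": "Dave"})
source.insert("u2", {"name": "Ann"})
source.update("u1", {"name": "David"})
source.delete("u2")

replica = {}
apply_changes(source.changes, replica)
print(replica == source.rows)  # True
```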

Some notable articles for further read.

Saturday, August 1, 2020

Cloud computing

This is my 3rd 31-day/1-month challenge: to understand technical terminology and share my learning. Each day I’ll pick a term, do some research about it, and try to explain it.

First word, cloud computing.

Cloud computing, as defined by the National Institute of Standards and Technology (NIST), is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.

In other words, and based on the 5 characteristics that NIST listed out:

There is a control panel to configure and set up the required resources without much interaction with the service provider. This is the first characteristic - on-demand self-service.

It is easy to access from any device anywhere (laptops, personal computers, handphones, tablets) over wired or wireless connections. This is the second characteristic - broad network access.

The computing resources are shared, supporting a multi-tenant model with a proper segregation mechanism. This is the third characteristic - resource pooling.

It is scalable based on usage demand, whether scaling up or out, or scaling down or in. This is the fourth characteristic - rapid elasticity.

The last characteristic, which is equally important to both consumer and service provider, is measured service. The resource usage is measurable and controllable, so consumers can control their usage and budget, and the service provider can ensure its available resources cope with customer demand and bill customers accordingly.

For reference, please visit NIST SP 800-145.

I tried to record and narrate my scripts, but I failed to get it to a quality I could accept. So I shall continue with this approach for my August 2020 challenge. :)