Monday, October 24, 2022

What is the Double Exclamation Mark in JavaScript

In JavaScript, the double exclamation operator (!!) converts a value to a boolean: "falsy" values become false and "truthy" values become true.


For example:


!! 0 -> false

!! null -> false

!! undefined -> false

!! 48 -> true

!! "hello" -> true

!! [1, 2, 3] -> true

The reference below is a more comprehensive guide to the double exclamation operator.

references:

https://www.codingem.com/javascript-double-exclamation-operator/#:~:text=In%20JavaScript%2C%20the%20double%20exclamation,%E2%80%9Ctruthy%E2%80%9D%20objects%20become%20true.


Sunday, October 23, 2022

Major components in Alibaba Cloud

Alibaba Cloud CDN (CDN) provides widely distributed nodes and allows you to deliver content, such as websites, audio, and videos. CDN allows users to download files from the nodes nearest to them, accelerating the response to user requests and increasing the success rate. CDN also resolves the delivery latency problem usually caused by distribution, bandwidth, and server performance issues.


Alibaba Cloud Server Load Balancer (SLB)

Server Load Balancer (SLB) distributes inbound traffic across multiple backend ECS instances to improve the availability and throughput of applications. SLB performs health checks on backend servers and routes requests only to healthy instances, which removes single points of failure.


VPC helps you build an isolated network environment on Alibaba Cloud, including customizing the IP address range, network segments, route tables, and gateways. In addition, you can connect a VPC to a traditional IDC through a leased line, VPN, or GRE tunnel to provide hybrid cloud services.


The Elastic Compute Service or ECS is one of the most common services within the Alibaba platform. This allows you to deploy virtual servers within your Alibaba Cloud environment. Most people will require some form of ECS Instance running within their environment as a part of at least one of their solutions.


Block storage is a form of cloud storage that is used to store data, often on storage area networks (SANs). Data is stored in blocks, with each block stored separately based on the efficiency needs of the SAN.


Alibaba Cloud Container Service for Kubernetes (ACK) integrates virtualization, storage, networking, and security capabilities. ACK allows you to deploy applications in high-performance and scalable containers and provides full lifecycle management of enterprise-class containerized applications.


Alibaba Cloud Resource Orchestration Service (ROS) provides developers and system administrators with a simple method to automate deployment and configuration of cloud resources. You can use templates in JSON and YAML formats to describe the configurations of cloud computing resources such as ECS, ApsaraDB for RDS, and SLB and the dependencies between these resources. These templates automatically deploy and configure all cloud resources in different accounts and regions to implement infrastructure as code.



OpenStack Major components


OpenStack Neutron is an SDN networking project focused on delivering networking-as-a-service (NaaS) in virtual compute environments.


What is Mistral?

Mistral is a workflow service. Lots of computations in computer systems nowadays can be represented as processes that consist of multiple interconnected steps that need to run in a particular order. Those steps are often interactions with components distributed across different machines: real hardware machines, cloud virtual machines or containers. Mistral provides capabilities to automate such processes.


Particularly, Mistral can be used, for example, for solving administrator tasks related to managing clusters of software, or for any other tasks that span multiple components and take long to complete. It can also be used as a central component for deploying distributed software at a truly large scale. In any case where the ability to track the progress of the activity becomes crucial, Mistral is a good fit.


A Mistral user can describe such a process as a set of tasks and transitions between them, and upload such a definition to Mistral, which will take care of state management, correct execution order, parallelism, synchronization and high availability. In Mistral terminology such a set of tasks and relations between them is called a workflow.


OpenStack Heat (CloudFormation-style orchestration)

Heat is the main project in the OpenStack Orchestration program. It implements an orchestration engine to launch multiple composite cloud applications based on templates in the form of text files that can be treated like code.


OpenStack Zun 

Zun (ex. Higgins) is the OpenStack Containers service. It aims to provide an API service for running application containers without the need to manage servers or clusters.



Architecture

Zun API: Process REST requests and validate inputted parameters.

Zun Compute: Launch containers and manage compute resources in localhost.

Keystone: Authenticate incoming requests.

Neutron: Provide networking for containers.

Glance: An option to store container images (another option is DockerHub).

Kuryr: A Docker network plugin for connecting containers to neutron networks.



Qinling(Function as a Service in OpenStack)

Qinling is Function as a Service for OpenStack. The project aims to provide a platform to support serverless functions (like AWS Lambda). Qinling can support different container orchestration platforms (Kubernetes, Swarm, etc.) and different function package storage backends (local/Swift/S3) through a plugin mechanism.



What is Swift?

The OpenStack Object Store project, known as Swift, offers cloud storage software so that you can store and retrieve lots of data with a simple API. It's built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured data that can grow without bound.



OpenStack Cinder 

Cinder is a Block Storage service for OpenStack. It's designed to present storage resources to end users that can be consumed by the OpenStack Compute Project (Nova). This is done through the use of either a reference implementation (LVM) or plugin drivers for other storage. The short description of Cinder is that it virtualizes the management of block storage devices and provides end users with a self service API to request and consume those resources without requiring any knowledge of where their storage is actually deployed or on what type of device.


Octavia is an open source, operator-scale load balancing solution designed to work with OpenStack.


Octavia was borne out of the Neutron LBaaS project. Its conception influenced the transformation of the Neutron LBaaS project, as Neutron LBaaS moved from version 1 to version 2. Starting with the Liberty release of OpenStack, Octavia has become the reference implementation for Neutron LBaaS version 2.


OpenStack Octavia accomplishes its delivery of load balancing services by managing a fleet of virtual machines, containers, or bare metal servers, collectively known as amphorae, which it spins up on demand. This on-demand, horizontal scaling feature differentiates Octavia from other load balancing solutions, thereby making Octavia truly suited "for the cloud."


OpenStack Horizon 

Horizon is the canonical implementation of OpenStack’s Dashboard, which provides a web based user interface to OpenStack services including Nova, Swift, Keystone, etc.


Firebase notification sending image

 const payload = {

        notification: {

          title: "Test",

          body: message,

          image: 'https://firebasestorage.googleapis.com/v0/b/ntest.appspot.com/o/mmedia%2Fpreviews%2FNature1-min.png?alt=media&token=0241cec4-26b0-4463-8701-c94404d3ee12'

        },

      } 

let response = await admin.messaging().sendToTopic(topic, payload, options);

console.log('response for sendToTopic ',response);


references:

https://stackoverflow.com/questions/38504078/firebase-expandable-notification-show-image-when-app-is-in-background

Saturday, October 22, 2022

What is AWS Direct Connect?

AWS Direct Connect is a network service that provides an alternative to using the public internet to reach AWS cloud services. It enables customers to establish secure, private connections to AWS for workloads that require higher bandwidth or lower latency than the internet can provide.



What is Hot vs Cold Storage

Hot storage refers to fast, easy-to-access data storage, like your local hard drive or a quick-access cloud storage provider such as Google Drive. In contrast, cold storage is for archival data that is rarely accessed and is usually stored off-site.



Wednesday, October 19, 2022

PostgreSQL database in a container - how to view the data

Once inside the container's terminal,

psql -U postgres   (here postgres is the username; use -d <dbname> to pick a specific database)


Now to list the tables,


\d+


Now to view contents of a table 


TABLE upstreams;


references:

https://stackoverflow.com/questions/26040493/how-to-show-data-in-a-table-by-using-psql-command-line-interface


Curl 409 response code - what is it?

 HTTP 409 error status: The HTTP 409 status code (Conflict) indicates that the request could not be processed because of conflict in the request, such as the requested resource is not in the expected state, or the result of processing the request would create a conflict within the resource.

Tuesday, October 18, 2022

AI/ML AttributeError: 'CRF' object has no attribute 'keep_tempfiles'

This can be worked around by wrapping the fit call in a try/except block:

try:

    crf.fit(X_train, y_train)

except AttributeError:

    pass

predictions = crf.predict(X_test)

references:

https://stackoverflow.com/questions/66059532/attributeerror-crf-object-has-no-attribute-keep-tempfiles

Thursday, October 13, 2022

What is Mongo GridFSBucket

GridFS files are stored in the database using two collections, normally called “fs.files” and “fs.chunks”. Each file uploaded to GridFS has one document in the “fs.files” collection containing information about the file and as many chunks as necessary in the “fs.chunks” collection to store the contents of the file.

A GridFS “bucket” is the combination of an “fs.files” and “fs.chunks” collection which together represent a bucket where GridFS files can be stored.

GridFSBucket

A GridFSBucket object is the root object representing a GridFS bucket.

You should always use a GridFSBucket object to interact with GridFS instead of directly referencing the underlying collections.

You create a GridFSBucket instance by calling its constructor:

IMongoDatabase database;

var bucket = new GridFSBucket(database);

You can also provide options when instantiating the GridFSBucket object:

IMongoDatabase database;

var bucket = new GridFSBucket(database, new GridFSBucketOptions

{

    BucketName = "videos",

    ChunkSizeBytes = 1048576, // 1MB

    WriteConcern = WriteConcern.Majority,

    ReadPreference = ReadPreference.Secondary

});

The BucketName value is the root part of the files and chunks collection names, so in this example the two collections would be named “videos.files” and “videos.chunks” instead of “fs.files” and “fs.chunks”.

The ChunkSizeBytes value defines the size of each chunk, and in this example we are overriding the default value of 261120 bytes (255 KB).

The WriteConcern is used when uploading files to GridFS, and the ReadPreference is used when downloading files from GridFS.
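For comparison, a roughly equivalent bucket can be created in Python with PyMongo's gridfs module (a minimal sketch; the connection, database name, and file name below are assumptions):

from pymongo import MongoClient, ReadPreference, WriteConcern
from gridfs import GridFSBucket

client = MongoClient()                         # assumes a local mongod on the default port
db = client["test"]                            # hypothetical database name

bucket = GridFSBucket(
    db,
    bucket_name="videos",                      # collections become videos.files / videos.chunks
    chunk_size_bytes=1048576,                  # 1 MB chunks instead of the 255 KB default
    write_concern=WriteConcern(w="majority"),
    read_preference=ReadPreference.SECONDARY,
)

# Upload a local file into the bucket and keep its ObjectId.
with open("clip.mp4", "rb") as f:              # hypothetical local file
    file_id = bucket.upload_from_stream("clip.mp4", f)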

references:

https://mongodb.github.io/mongo-csharp-driver/2.4/reference/gridfs/gettingstarted

Wednesday, October 12, 2022

Mongo GridFS how to add files

mongofiles.exe -d gridfs put song.mp3

Here, gridfs is the name of the database in which the file will be stored. If the database is not present, MongoDB will create it on the fly. song.mp3 is the name of the uploaded file. To see the file's document in the database, you can use the find query:

db.fs.files.find()

The above command returned the following document −

{

   _id: ObjectId('534a811bf8b4aa4d33fdf94d'), 

   filename: "song.mp3", 

   chunkSize: 261120, 

   uploadDate: new Date(1397391643474), md5: "e4f53379c909f7bed2e9d631e15c1c41",

   length: 10401959 

}

We can also see all the chunks present in fs.chunks collection related to the stored file with the following code, using the document id returned in the previous query −

db.fs.chunks.find({files_id:ObjectId('534a811bf8b4aa4d33fdf94d')})

references:

https://www.tutorialspoint.com/mongodb/mongodb_gridfs.htm#:~:text=GridFS%20is%20the%20MongoDB%20specification,document%20size%20limit%20of%2016MB.

Mongo GridFS Storage

GridFS is the MongoDB specification for storing and retrieving large files such as images, audio files, video files, etc. It is kind of a file system to store files but its data is stored within MongoDB collections. GridFS has the capability to store files even greater than its document size limit of 16MB.


GridFS divides a file into chunks and stores each chunk of data in a separate document, each of maximum size 255 KB.


GridFS by default uses two collections fs.files and fs.chunks to store the file's metadata and the chunks. Each chunk is identified by its unique _id ObjectId field. The fs.files serves as a parent document. The files_id field in the fs.chunks document links the chunk to its parent.


Following is a sample document of fs.files collection −

{

   "filename": "test.txt",

   "chunkSize": NumberInt(261120),

   "uploadDate": ISODate("2014-04-13T11:32:33.557Z"),

   "md5": "7b762939321e146569b07f72c62cca4f",

   "length": NumberInt(646)

}

Following is a sample document of fs.chunks document −

{

   "files_id": ObjectId("534a75d19f54bfec8a2fe44b"),

   "n": NumberInt(0),

   "data": "Mongo Binary Data"

}

references:

https://www.tutorialspoint.com/mongodb/mongodb_gridfs.htm#:~:text=GridFS%20is%20the%20MongoDB%20specification,document%20size%20limit%20of%2016MB.


Docker storage options

Ephemeral storage - lives inside the container; when the container is shut down, the data is lost.

Persistent storage - stored outside of the container, so it survives container restarts.

Docker data volumes

Docker data volumes provide the ability to create a resource that can be used to persistently store and retrieve data within a container. The functionality of data volumes was significantly enhanced in Docker version 1.9, with the ability to assign meaningful names to a volume, list volumes, and list the container associated with a volume.

Data volumes are a step forward from storing data within the container itself and offer better performance for the application. A running container is built from a snapshot of the base container image using file-based copy-on-write (CoW) techniques, so any data stored natively in the container attracts a significant management overhead. Data volumes sit outside this CoW mechanism and exist on the host filesystem, so they're more efficient to read from and write to.

However, there are issues with using data volumes. For example, an existing volume can’t be attached to a running or new container, which means a volume can end up orphaned.

Data volume container


An alternative solution is to use a dedicated container to host a volume and to mount that volume space to other containers -- a so-called data volume container. In this technique, the volume container outlasts the application containers and can be used as a method of sharing data between more than one container at the same time.


Having a long-running container to store data provides other opportunities. For instance, a backup container can be spun up that copies or backs up the data in the container volume, for example. In both of the above scenarios, the container volume sits within the file structure of the Docker installation, typically /var/lib/docker/volumes. This means you can use standard tools to access this data, but beware, Docker provides no locking or security mechanisms to maintain data integrity.



Directory mounts


A third option for persistent data is to mount a local host directory into a container. This goes  a step further than the methods described above in that the source directory can be any directory on the host running the container, rather than one under the Docker volumes folder. At container start time, the volume and mount point are specified on the Docker run command, providing a directory within the container that can be used by the application, e.g., data.


Storage plugins


Probably the most interesting development for persistent storage has been the ability to connect to external storage platforms through storage plugins. The plugin architecture provides an interface and API that allows storage vendors to build drivers to automate the creation and mapping of storage from external arrays and appliances into Docker and to be assigned to a container.


Today there are plugins to automate storage provisioning from HPE 3PAR, EMC (ScaleIO, XtremIO, VMAX, Isilon), and NetApp. There are also plugins to support storage from public cloud providers like Azure File Storage and Google Compute Platform.


Plugins map storage from a single host to an external storage source, typically an appliance or array. However, if a container is moved to another host for load balancing or failover reasons, then that storage association is lost. ClusterHQ has developed a platform called Flocker that manages and automates the process of moving the volume associated with a container to another host. Many storage vendors, including Hedvig, Nexenta, Kaminario, EMC, Dell, NetApp and Pure Storage, have chosen to write to the Flocker API, providing resilient storage and clustered container support within a single data center.


https://www.edureka.co/commun

Tuesday, October 11, 2022

AI/ML Spacy package

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.


import spacy

from spacy.lang.en.examples import sentences 

nlp = spacy.load("en_core_web_sm")

doc = nlp(sentences[0])

print(doc.text)

for token in doc:

    print(token.text, token.pos_, token.dep_)


‘en’ stands for English language, which means you are working specifically on English language using the spaCy library.

‘core’ stands for core NLP tasks such as lemmatization or PoS tagging, which means you are loading the pre-built models which can perform some of the core NLP-related tasks.

‘web’ is the pre-built model of the spaCy library which you will use for NLP tasks that are trained from web source content such as blogs, social media and comments.

‘sm’ means small models which are faster and use smaller pipelines but are comparatively less accurate. As a complement to ‘sm’, you can use ‘lg’ or ‘md’ for larger pipelines which will be more accurate than ‘sm’.

 


references:

https://spacy.io/models/en#en_core_web_sm

Monday, October 10, 2022

AI/ML What is a heteronym

A heteronym is a word that has a different pronunciation and meaning from another word but the same spelling. These are homographs that are not homophones. Thus, lead and lead are heteronyms, but mean and mean are not, since they are pronounced the same

Reference:

https://en.wikipedia.org/wiki/Heteronym_(linguistics)

AI/ML POS tagging using Spacy

import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp("Am learning AI/ML from Upgrad")

for token in doc:

    print(token.text, token.pos_, token.tag_)


Am AUX VBP

learning VERB VBG

AI PROPN NNP

/ SYM SYM

ML PROPN NNP

from ADP IN

Upgrad PROPN NNP


AI/ML NN vs. VBG

Definition of VBG (Verb gerund or present participle)

Verbs ending with -ing, called gerund forms of verbs, can function as nouns (called verbal nouns), and it exhibits the ordinary properties of a noun. Though derived from a verb, a verbal noun is strictly a noun, and it exhibits nominal properties: 


It takes determiners like the and this;

It permits adjectives (but not adverbs);

It permits following prepositional phrases (but not objects);

And it can even be pluralized if the sense permits.


Examples:

Shooting paintballs is not an art form.   Shooting is a noun. 

Humor is laughing at what you haven't got when you ought to have it.   Laughing is a noun. 


In contrast, a verb gerund is still a verb, and it exhibits ordinary verbal properties, such as taking objects and adverbs.  Notice the difference in the following example:


Examples:


In football, deliberately tripping an opponent is a foul.      (VBG)     

Tripping is a verb gerund form because there is no determiner the before it


In football, the deliberate tripping of an opponent is a foul.  (NN)

Here the verbal noun tripping takes the determiner the, the adjective deliberate and the prepositional phrase of an opponent, but it exhibits no verbal properties at all.


Examples:


The building of the British Empire may be said to have begun with the ascent of Queen Elizabeth to the throne.

His acting of the part of Othello was distinguished by a breadth and grandeur that placed it far beyond the efforts of other actors.

The dead might as well try to speak to the living as the old to the young.


In conclusion, the distinction between verb gerunds and verbal nouns can be fuzzy because both appear in the same -ing form. A simple rule is to check:

Whether the token follows a determiner like "a", "an", "the", "this", or an adjective (not an adverb). If it does, it is a noun (verbal noun);

Whether the token is followed by a prepositional phrase (but not an object). If it does, it is a noun (verbal noun);

Whether it can be pluralized if the sense permits. If it does, it is a noun;

In other cases, it is more likely a verb (gerund, tagged VBG).
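A quick way to sanity-check this distinction is to run the two football sentences through spaCy and compare the tags assigned to "tripping" (a small sketch; the exact tags depend on the statistical model, so they may not always match the treebank convention):

import spacy

nlp = spacy.load("en_core_web_sm")

sentences = [
    "In football, deliberately tripping an opponent is a foul.",
    "In football, the deliberate tripping of an opponent is a foul.",
]

for text in sentences:
    doc = nlp(text)
    for token in doc:
        if token.text == "tripping":
            # Expect VBG for the first sentence and NN for the second.
            print(token.text, token.pos_, token.tag_)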

references:

https://sites.google.com/site/partofspeechhelp/home/nn_vbg

AI/ML PoS NN vs NNP

Definition of NNP (Proper Noun)

Ref main page

Proper nouns (NNP) name specific people, places, things, or ideas. Since these nouns name specific things, they always begin with a capital letter.

Examples:

Britney, Paris, Rover, Nike

Sometimes, proper nouns contain two or more important words.

Examples:

Britney Spears,     Central Park Zoo,     Pacific Ocean

Definition of NN  (Common Noun)

Ref main page

Common nouns are the opposite of proper nouns. They are your run of the mill, generic nouns. They name people, places, things or ideas that are not specific.

Examples:

woman, city, dog, shoe

References:

https://sites.google.com/site/partofspeechhelp/home/nn_nnp

AI/ML What are POS tags

The Penn Treebank POS tag set (number, tag, description):

1. CC Coordinating conjunction

2. CD Cardinal number

3. DT Determiner

4. EX Existential there

5. FW Foreign word

6. IN Preposition or subordinating conjunction

7. JJ Adjective

8. JJR Adjective, comparative

9. JJS Adjective, superlative

10. LS List item marker

11. MD Modal

12. NN Noun, singular or mass

13. NNS Noun, plural

14. NNP Proper noun, singular

15. NNPS Proper noun, plural

16. PDT Predeterminer

17. POS Possessive ending

18. PRP Personal pronoun

19. PRP$ Possessive pronoun

20. RB Adverb

21. RBR Adverb, comparative

22. RBS Adverb, superlative

23. RP Particle

24. SYM Symbol

25. TO to

26. UH Interjection

27. VB Verb, base form

28. VBD Verb, past tense

29. VBG Verb, gerund or present participle

30. VBN Verb, past participle

31. VBP Verb, non-3rd person singular present

32. VBZ Verb, 3rd person singular present

33. WDT Wh-determiner

34. WP Wh-pronoun

35. WP$ Possessive wh-pronoun

36. WRB Wh-adverb



References:

https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

AI/ML Damerau–Levenshtein distance

In information theory and computer science, the Damerau–Levenshtein distance (named after Frederick J. Damerau and Vladimir I. Levenshtein[1][2][3]) is a string metric for measuring the edit distance between two sequences. Informally, the Damerau–Levenshtein distance between two words is the minimum number of operations (consisting of insertions, deletions or substitutions of a single character, or transposition of two adjacent characters) required to change one word into the other.


The Damerau–Levenshtein distance differs from the classical Levenshtein distance by including transpositions among its allowable operations in addition to the three classical single-character edit operations (insertions, deletions and substitutions).[4][2]


In his seminal paper,[5] Damerau stated that in an investigation of spelling errors for an information-retrieval system, more than 80% were a result of a single error of one of the four types. Damerau's paper considered only misspellings that could be corrected with at most one edit operation. While the original motivation was to measure distance between human misspellings to improve applications such as spell checkers, Damerau–Levenshtein distance has also seen uses in biology to measure the variation between protein sequences.[6]
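Below is a small Python sketch of the restricted variant, usually called optimal string alignment distance, which counts insertions, deletions, substitutions, and transpositions of adjacent characters (the unrestricted Damerau-Levenshtein metric needs a slightly more involved algorithm):

def osa_distance(a, b):
    # d[i][j] = distance between the first i chars of a and the first j chars of b
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[len(a)][len(b)]

print(osa_distance("ca", "ac"))           # 1 -- a single transposition
print(osa_distance("kitten", "sitting"))  # 3 -- same as plain Levenshtein here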


References:

https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#:~:text=Informally%2C%20the%20Damerau%E2%80%93Levenshtein%20distance,one%20word%20into%20the%20other.

Sunday, October 9, 2022

Firebase database rules - some notes

In the rules below, newData represents the data as it will exist after the write completes.

So, to check something against the incoming data, place the check in a .validate rule.


{  

  "rules": {

    "myproj": {

      "chats":{

         "$roomId":{

            ".read":  "auth.uid != null",

            ".write":  "auth.uid != null",

            ".validate" : "newData.child('nick').val() == 'myproj11'"  

         }

      }

    }

  }

}



AI/ML Phonetic Hashing

Reducing a word to its base form using stemming and lemmatization is part of a technique called canonicalisation. Stemming tries to reduce a word to its root form, while lemmatization tries to reduce a word to its lemma. The root and the lemma are both base forms of the inflected word; only the method of getting there differs.


There are some cases that can't be handled by either stemming or lemmatization. You need another preprocessing method in order to stem or lemmatize the words efficiently.


For example, suppose the corpus contains two misspelled versions of the word 'disappearing': 'dissappearng' and 'dissapearing'. After you stem these words, you'll have two different stems, 'dissappear' and 'dissapear', so you still have the problem of redundant tokens. Lemmatization, on the other hand, won't even work on these two words: it returns them unchanged, because it only works on correct dictionary spellings.



To deal with different spellings that occur due to different pronunciations, we use the concept of phonetic hashing which will help you canonicalise different versions of the same word to a base word.



Phonetic hashing is done using the Soundex algorithm. It doesn’t matter which language the input word comes from — as long as the words sound similar, they will get the same hash code.
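A simplified Soundex-style hash can be written in a few lines of Python. This is only a sketch: the official algorithm has extra rules (for example, 'h' and 'w' between consonants are treated specially) that are skipped here.

def soundex(word):
    # Map consonants to their Soundex digit; vowels, h, w, y get no code.
    groups = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3", "l": "4", "mn": "5", "r": "6"}
    code = {ch: digit for letters, digit in groups.items() for ch in letters}

    word = word.lower()
    digits = [code.get(ch, "") for ch in word]

    # Keep a digit only if the previous letter did not produce the same digit.
    kept = [d for i, d in enumerate(digits) if d and (i == 0 or d != digits[i - 1])]

    # The first letter is kept as a letter, so drop its own digit if it had one.
    if digits[0]:
        kept = kept[1:]

    return (word[0].upper() + "".join(kept) + "000")[:4]

print(soundex("dissapearing"), soundex("dissappearng"))  # both misspellings hash to the same code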


References:

https://amitg0161.medium.com/phonetic-hashing-and-soundex-in-python-60d4ca7a2843




Saturday, October 8, 2022

Firebase database security rule samples


All Authenticated users

{

  "rules": {

    ".read": "auth.uid !== null",

    ".write": "auth.uid !== null"

  }

}


Content owner only access 

{

  "rules": {

    "some_path": {

      "$uid": {

        // Allow only authenticated content owners access to their data

        ".read": "auth !== null && auth.uid === $uid",

        ".write": "auth !== null && auth.uid === $uid"

      }

    }

  }

}



Mixed public and private access 


{

// Allow anyone to read data, but only authenticated content owners can

// make changes to their data


  "rules": {

    "some_path": {

      "$uid": {

        ".read": true,

        // or ".read": "auth.uid !== null" for only authenticated users

        ".write": "auth.uid === $uid"

      }

    }

  }

}



Attribute based Role based access 


{

  "rules": {

    "some_path": {

      "${subpath}": {

        //

        ".write": "root.child('users').child(auth.uid).child('role').val() === 'admin'",

        ".read": true

      }

    }

  }

}


Custom Claim based rules 

{

  "rules": {

    "some_path": {

      "$uid": {

        // Create a custom claim for each role or group

        // you want to leverage

        ".write": "auth.uid !== null && auth.token.writer === true",

        ".read": "auth.uid !== null && auth.token.reader === true"

      }

    }

  }

}


Tenant based rules 


{

  "rules": {

    "some_path": {

      "$uid": {

        // Only allow reads and writes if user belongs to a specific tenant

        ".write": "auth.uid !== null && auth.token.firebase.tenant === 'tenant2-m6tyz'",

        ".read": "auth.uid !== null

      }

    }

  }




Path delineated access 


{

  "rules": {

    "some_path/$uid": {

      ".write": "auth.uid === uid",

      // Create a "public" subpath in your dataset

      "public": {

        ".read": true

        // or ".read": "auth.uid !== null"

      },

      // Create a "private" subpath in your dataset

      "private": {

        ".read": "auth.uid === uid"

      }

    }

  }

}

References :

https://firebase.google.com/docs/rules/basics 

Security Rules and Firebase Authentication

Firebase Security Rules provide access control and data validation in a format that supports multiple levels of complexity. To build user-based and role-based access systems that keep your users' data safe, use Firebase Authentication with Firebase Security Rules.

Authentication identifies users requesting access to your data and provides that information as a variable you can leverage in your rules. The auth variable contains the following information:

uid: A unique user ID, assigned to the requesting user.

token: A map of values collected by Authentication.

The auth.token variable contains the following values:

email The email address associated with the account, if present.

email_verified true if the user has verified they have access to the email address. Some providers automatically verify email addresses they own.

phone_number The phone number associated with the account, if present.

name The user's display name, if set.

sub The user's Firebase UID. This is unique within a project.

firebase.identities Dictionary of all the identities that are associated with this user's account. The keys of the dictionary can be any of the following: email, phone, google.com, facebook.com, github.com, twitter.com. The values of the dictionary are arrays of unique identifiers for each identity provider associated with the account. For example, auth.token.firebase.identities["google.com"][0] contains the first Google user ID associated with the account.

firebase.sign_in_provider The sign-in provider used to obtain this token. Can be one of the following strings: custom, password, phone, anonymous, google.com, facebook.com, github.com, twitter.com.

firebase.tenant The tenantId associated with the account, if present. e.g. tenant2-m6tyz

You can access custom claims in Rules after creating custom claims in Authentication. You can then reference those custom claims using the auth.token variable.


{

  "rules": {

    "some_path/$sub_path": {

      // Create a custom claim for the admin role

      ".write": "auth.uid !== null && auth.token.writer === true"

      ".read": "auth.uid !== null"

      }

    }

  }


references:

https://firebase.google.com/docs/rules/rules-and-auth

Firebase database - How rules apply to paths

In Realtime Database, Rules apply atomically, meaning that rules at higher-level parent nodes override rules at more granular child nodes and rules at a deeper node can't grant access to a parent path. You can't refine or revoke access at a deeper path in your database structure if you've already granted it for one of the parent paths.

{

  "rules": {

     "foo": {

        // allows read to /foo/*

        ".read": "data.child('baz').val() === true",

        "bar": {

          // ignored, since read was allowed already

          ".read": false

        }

     }

  }

}

While it may not seem immediately intuitive, this is a powerful part of the rules language and allows for very complex access privileges to be implemented with minimal effort. This is particularly useful for user-based security.


However, .validate rules do not cascade. All validate rules must be satisfied at all levels of the hierarchy for a write to be allowed.



Additionally, because rules do not apply back to a parent path, read or write operations fail if there isn't a rule at the requested location or at a parent location that grants access. Even if every affected child path is accessible, reading at the parent location will fail completely. Consider this structure:


{

  "rules": {

    "records": {

      "rec1": {

        ".read": true

      },

      "rec2": {

        ".read": false

      }

    }

  }

}


Without understanding that rules are evaluated atomically, it might seem like fetching the /records/ path would return rec1 but not rec2. The actual result, however, is an error:


var db = firebase.database();

db.ref("records").once("value", function(snap) {

  // success method is not called

}, function(err) {

  // error callback triggered with PERMISSION_DENIED

});



references:

https://firebase.google.com/docs/rules/rules-behavior


AI/ML Porter Stemmer and Snowball Stemmer

Snowball Stemmer: a stemming algorithm also known as the Porter2 stemmer, since it is an improved version of the Porter Stemmer in which several of the original's issues were fixed.


Stemming: the process of reducing a word to its word stem by stripping affixes (suffixes and prefixes). In simple words, stemming reduces a word to its base word or stem in such a way that words of a similar kind lie under a common stem. For example, the words care, cared and caring lie under the same stem 'care'. Stemming is important in natural language processing (NLP).


Some few common rules of Snowball stemming are:


Few Rules:

ILY  -----> ILI

LY   -----> Nill

SS   -----> SS

S    -----> Nill

ED   -----> E,Nill


Nill means the suffix is replaced with nothing and is just removed.

There may be cases where these rules vary depending on the word. In the case of the suffix 'ed', the words 'cared' and 'bumped' are stemmed to 'care' and 'bump'; here the suffix removed from 'cared' is effectively 'd' only, not 'ed'. Another interesting case is the word 'stemmed', which is reduced to 'stem' and not 'stemm'. The suffix handling therefore depends on the word.


Word           Stem

cared          care

university     univers

fairly         fair

easily         easili

singing        sing

sings          sing

sung           sung

singer         singer

sportingly     sport


import nltk

from nltk.stem.snowball import SnowballStemmer

  

#the stemmer requires a language parameter

snow_stemmer = SnowballStemmer(language='english')

  

#list of tokenized words

words = ['cared','university','fairly','easily','singing',

       'sings','sung','singer','sportingly']

  

#stem's of each word

stem_words = []

for w in words:

    x = snow_stemmer.stem(w)

    stem_words.append(x)

      

#print stemming results

for e1,e2 in zip(words,stem_words):

    print(e1+' ----> '+e2)


Difference Between Porter Stemmer and Snowball Stemmer:


Snowball Stemmer is more aggressive than Porter Stemmer.

Some issues in Porter Stemmer were fixed in Snowball Stemmer.

There is only a little difference in the working of these two.

Words like ‘fairly‘ and ‘sportingly‘ were stemmed to ‘fair’ and ‘sport’ in the snowball stemmer but when you use the porter stemmer they are stemmed to ‘fairli‘ and ‘sportingli‘.

The difference between the two algorithms can be clearly seen in the way the word 'sportingly' is stemmed by both. Clearly, the Snowball Stemmer stems it to a more accurate stem.




References:

https://www.geeksforgeeks.org/snowball-stemmer-nlp


Firebase Security Rules - Part1

Firebase Security Rules leverage extensible, flexible configuration languages to define what data your users can access for Realtime Database, Cloud Firestore, and Cloud Storage. Firebase Realtime Database Rules leverage JSON in rule definitions, while Cloud Firestore Security Rules and Firebase Security Rules for Cloud Storage leverage a unique language built to accommodate more complex rules-specific structures.

Firebase Security Rules work by matching a pattern against database paths, and then applying custom conditions to allow access to data at those paths. All Rules across Firebase products have a path-matching component and a conditional statement allowing read or write access. You must define Rules for each Firebase product you use in your app.

{

  "rules": {

    "<<path>>": {

    // Allow the request if the condition for each method is true.

      ".read": <<condition>>,

      ".write": <<condition>>,

      ".validate": <<condition>>

    }

  }

}

Rules are applied as OR statements, not AND statements. Consequently, if multiple rules match a path, and any of the matched conditions grants access, Rules grant access to the data at that path. Therefore, if a broad rule grants access to data, you can't restrict with a more specific rule. You can, however, avoid this problem by making sure your Rules don't overlap too much. Firebase Security Rules flag overlaps in your matched paths as compiler warnings.


{

    "messages": {

      "message0": {

        "content": "Hello",

        "timestamp": 1405704370369

      },

      "message1": {

        "content": "Goodbye",

        "timestamp": 1405704395231

      },

      ...

    }

  }

For a document like the above, below is the rule JSON for firebase database 

{

    "rules": {

      "messages": {

        "$message": {

          // only messages from the last ten minutes can be read

          ".read": "data.child('timestamp').val() > (now - 600000)",


          // new messages must have a string content and a number timestamp

          ".validate": "newData.hasChildren(['content', 'timestamp']) &&

                        newData.child('content').isString() &&

                        newData.child('timestamp').isNumber()"

        }

      }

    }

  }


As the example above shows, Realtime Database Rules support a $location variable to match path segments. Use the $ prefix in front of your path segment to match your rule to any child nodes along the path. Here $message is the location key 

You can also use the $variable in parallel with constant path names.

{

    "rules": {

      "widget": {

        // a widget can have a title or color attribute

        "title": { ".validate": true },

        "color": { ".validate": true },


        // but no other child paths are allowed

        // in this case, $other means any key excluding "title" and "color"

        "$other": { ".validate": false }

      }

    }

  }

Pre-defined variables

now => The current time in milliseconds since the Unix epoch. This works particularly well for validating timestamps created with the SDK's firebase.database.ServerValue.TIMESTAMP.

root => A RuleDataSnapshot representing the root path in the Firebase database as it exists before the attempted operation.

newData => A RuleDataSnapshot representing the data as it would exist after the attempted operation. It includes the new data being written and existing data.

data => A RuleDataSnapshot representing the data as it existed before the attempted operation.

$ variables => A wildcard path used to represent ids and dynamic child keys.

auth => Represents an authenticated user's token payload.


These variables can be used anywhere in your rules. For example, the security rules below ensure that data written to the /foo/ node must be a string less than 100 characters:

{

  "rules": {

    "foo": {

      // /foo is readable by the world

      ".read": true,


      // /foo is writable by the world

      ".write": true,


      // data written to /foo must be a string less than 100 characters

      ".validate": "newData.isString() && newData.val().length < 100"

    }

  }

}


references:

https://firebase.google.com/docs/rules/rules-behavior


Friday, October 7, 2022

AI/ML What is Zipf's law

Zipf's law (/zɪf/, German: [ts͡ɪpf]) is an empirical law formulated using mathematical statistics that refers to the fact that for many types of data studied in the physical and social sciences, the rank-frequency distribution is an inverse relation. The Zipfian distribution is one of a family of related discrete power law probability distributions. It is related to the zeta distribution, but is not identical.

Zipf's law was originally formulated in terms of quantitative linguistics, stating that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, etc. For example, in the Brown Corpus of American English text, the word "the" is the most frequently occurring word, and by itself accounts for nearly 7% of all word occurrences (69,971 out of slightly over 1 million). True to Zipf's Law, the second-place word "of" accounts for slightly over 3.5% of words (36,411 occurrences), followed by "and" (28,852). Only 135 vocabulary items are needed to account for half the Brown Corpus.[1]

The law is named after the American linguist George Kingsley Zipf, who popularized it and sought to explain it, though he did not claim to have originated it.[2] The French stenographer Jean-Baptiste Estoup appears to have noticed the regularity before Zipf.[3] It was also noted in 1913 by German physicist Felix Auerbach.[4]

This is also called power law distribution
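A rough way to eyeball Zipf's law on any text is to count word frequencies and check that frequency times rank stays roughly constant (a small sketch; corpus.txt is a hypothetical plain-text corpus file):

import re
from collections import Counter

with open("corpus.txt", encoding="utf-8") as f:   # hypothetical corpus file
    words = re.findall(r"[a-z]+", f.read().lower())

counts = Counter(words)
for rank, (word, freq) in enumerate(counts.most_common(10), start=1):
    # Under Zipf's law, freq * rank should be roughly the same for every row.
    print(f"{rank:>2}  {word:<12} freq={freq:<8} freq*rank={freq * rank}")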

References:

https://en.wikipedia.org/wiki/Zipf%27s_law

Thursday, October 6, 2022

Regex - Common useful ones Part 1

 To extract all the words from a given sentence 

The + character is a special character in regex. It is used to match 1 or more repetitions of the preceding regular expression or class which in our case is [a-z]. So it matches 1 or more repetitions of lower case alphabets and hence we get the above list. If we wanted to include 1 or more repetitions of both lower and upper case alphabets, we can create the pattern as follows:

words_pattern = '[a-zA-Z]+'

Extracting Words Followed by Specific Pattern

Let’s assume that our usernames can only contain alphabets and anything followed by an '@' without any space is a username.

comment = "This is an great article @Bharath. You have explained the complex topic in a very simplistic manner. @Yashwant, you might find this article to be useful."

Let’s create a regex pattern that can be used to search all the usernames tagged in the comment.

username_pattern = '@([a-zA-Z]+)'

re.findall('@([a-zA-Z]+)', comment)


Find all words that have 'ing' in them.

import re

# example input string (assumed for illustration)

string = "playing cricket and singing in the evening"

# regex pattern

pattern = r"\w*ing\w*"  # write regex to extract words containing 'ing'

# store results in the list 'result'

result = re.findall(pattern, string)  # extract words having the required pattern, using the findall function

print(result)  # ['playing', 'singing', 'evening']


references:

https://medium.com/quantrium-tech/extracting-words-from-a-string-in-python-using-regex-dac4b385c1b8

Regex - findIter and findall

 The re.finditer() works exactly the same as the re.findall() method except it returns an iterator yielding match objects matching the regex pattern in a string instead of a list.

It scans the string from left to right, and matches are returned in the iterator form. Later, we can use this iterator object to extract all matches.

In simple words, finditer() returns an iterator over MatchObject objects.


But why use finditer()?

In some scenarios, the number of matches is high, and you could risk filling up your memory by loading them all using findall(). Instead of that using the finditer(), you can get all possible matches in the form of an iterator object, which will improve performance.

In other words, finditer() returns a lazy iterator that loads matches into memory only as they are consumed.
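A minimal comparison of the two (a quick sketch):

import re

text = "cat bat rat"

# findall builds the whole result list up front.
print(re.findall(r"\w+at", text))         # ['cat', 'bat', 'rat']

# finditer yields Match objects lazily, one at a time.
for m in re.finditer(r"\w+at", text):
    print(m.group(), m.start(), m.end())  # matched word plus its position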

references:

https://pynative.com/python-regex-findall-finditer/#:~:text=finditer()%20works%20exactly%20the,returned%20in%20the%20iterator%20form.

Regex Python search vs match

 Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).


Note that match may differ from search even when using a regular expression beginning with '^': '^' matches only at the start of the string, or in MULTILINE mode also immediately following a newline. The “match” operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optional pos argument regardless of whether a newline precedes it.


If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.


Note: If you want to locate a match anywhere in string, use search() instead.


Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
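A short illustration of the difference (a quick sketch):

import re

s = "learning regex in python"

print(re.match(r"regex", s))     # None: the pattern is not at the start of the string
print(re.search(r"regex", s))    # a Match object: found anywhere in the string
print(re.match(r"learning", s))  # a Match object: the pattern is at the beginning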


references:

https://www.edureka.co/community/14245/what-is-the-difference-between-re-search-and-re-match#:~:text=Python%20offers%20two%20different%20primitive,what%20Perl%20does%20by%20default).

Monday, October 3, 2022

Useful links for Mental health research - Pinkymind

Awesome list of measures 

https://www.cataloguementalhealth.ac.uk/?content=home


Font for mental wellbeing??? 

https://www.crazyegg.com/blog/psychology-of-fonts-infographic/

https://fontsinuse.com/uses/48966/free-therapy


Good designs

https://www.thedenizenco.com/print-design

https://fontsinuse.com/uses/48966/free-therapy


Sunday, October 2, 2022

AI/ML Neural Network: The Dead Neuron

Choosing an activation function for the hidden layer is not an easy task. The configuration of the hidden layer is an extremely active topic of research, and there is no settled theory about how many neurons, how many layers, or which activation function to use for a given dataset. For a long time, sigmoid was the most popular activation function due to its non-linearity. As neural networks advanced to deeper architectures, the vanishing gradient problem appeared. The rectified linear unit (ReLU) has turned out to be the default option for the hidden layer's activation function, since it shuts down the vanishing gradient problem by having a bigger gradient than sigmoid.


The drawback of ReLU is that such neurons cannot learn on examples for which their activation is zero. This usually happens if you initialize the entire neural network with zeros and place ReLU on the hidden layers. Another cause is when a large gradient flows through: a ReLU neuron will update its weights and might end up with a large negative weight and bias. If this happens, the neuron will always produce 0 during the forward propagation, and then the gradient flowing through it will forever be zero, irrespective of the input.


In other words, the weights of this neuron will never be updated again. Such a neuron can be considered as a dead neuron, which is considered a kind of permanent “brain damage” in biological terms. A dead neuron can be thought of as a natural Dropout. But the problem is if every neuron in a specific hidden layer is dead, it cuts the gradient to the previous layer resulting in zero gradients to the layers behind it. It can be fixed by using smaller learning rates so that the big gradient doesn’t set a big negative weight and bias in a ReLU neuron. Another fix is to use the Leaky ReLU
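The difference between the two activations is easy to see numerically (a small NumPy sketch):

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

z = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(z))        # negative inputs map to 0, so their gradient is 0 as well
print(leaky_relu(z))  # the small negative slope keeps a nonzero gradient for x < 0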

references:

https://towardsdatascience.com/neural-network-the-dead-neuron-eaa92e575748


AI/ML what are General ensemble models

Ensemble modeling is the process of running two or more related but different analytical models and then synthesizing the results into a single score or spread in order to improve the accuracy of predictive analytics and data mining applications.


Ensemble methods have higher predictive accuracy, compared to the individual models


The most popular ensemble methods are boosting, bagging, and stacking. Ensemble methods are ideal for regression and classification, where they reduce bias and variance to boost the accuracy of models.




Bagging is advantageous since weak base learners are combined to form a single strong learner that is more stable than single learners. It also eliminates any variance, thereby reducing the overfitting of models. One limitation of bagging is that it is computationally expensive. Thus, it can lead to more bias in models when the proper procedure of bagging is ignored.



Boosting is an ensemble technique that learns from previous predictor mistakes to make better predictions in the future. The technique combines several weak base learners to form one strong learner,


Boosting takes many forms, including gradient boosting, Adaptive Boosting (AdaBoost), and XGBoost (Extreme Gradient Boosting). 


 AdaBoost uses weak learners in the form of decision trees, which mostly include one split that is popularly known as decision stumps. AdaBoost’s main decision stump comprises observations carrying similar weights.


Gradient boosting adds predictors sequentially to the ensemble, where preceding predictors correct their successors, thereby increasing the model’s accuracy. New predictors are fit to counter the effects of errors in the previous predictors. The gradient of descent helps the gradient booster identify problems in learners’ predictions and counter them accordingly.


XGBoost makes use of decision trees with boosted gradient, providing improved speed and performance. It relies heavily on the computational speed and the performance of the target model. Model training should follow a sequence, thus making the implementation of gradient boosted machines slow.
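As a quick illustration, scikit-learn ships bagging, AdaBoost, and gradient boosting out of the box, and they can be compared on a synthetic dataset (a small sketch; the dataset and hyperparameters are arbitrary):

from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

for model in (BaggingClassifier(random_state=42),
              AdaBoostClassifier(random_state=42),
              GradientBoostingClassifier(random_state=42)):
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, round(scores.mean(), 3))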

references:
https://corporatefinanceinstitute.com/resources/knowledge/other/ensemble-methods/

AI/ML ways to avoid overfitting in Decision trees

 If the decision tree is allowed to train to its full strength, the model will overfit the training data. There are various techniques to prevent the decision tree model from overfitting.

Unlike other regression models, decision tree doesn’t use regularization to fight against over-fitting. Instead, it employs tree pruning. Selecting the right hyper-parameters (tree depth and leaf size) also requires experimentation, e.g. doing cross-validation with a hyper-parameter matrix. 

Pruning

* Pre-pruning

* Post-pruning

Ensemble

* Random Forest

By default, the decision tree model is allowed to grow to its full depth. Pruning refers to a technique to remove the parts of the decision tree to prevent growing to its full depth. By tuning the hyperparameters of the decision tree model one can prune the trees and prevent them from overfitting.

There are two types of pruning Pre-pruning and Post-pruning

Pre-Pruning:

The pre-pruning technique refers to the early stopping of the growth of the decision tree. The pre-pruning technique involves tuning the hyperparameters of the decision tree model prior to the training pipeline. The hyperparameters of the decision tree including max_depth, min_samples_leaf, min_samples_split can be tuned to early stop the growth of the tree and prevent the model from overfitting.


Post-Pruning:

The Post-pruning technique allows the decision tree model to grow to its full depth, then removes the tree branches to prevent the model from overfitting. Cost complexity pruning (ccp) is one type of post-pruning technique. In case of cost complexity pruning, the ccp_alpha can be tuned to get the best fit model.
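A small scikit-learn sketch of both approaches (the dataset and parameter values are just for illustration):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: limit depth and leaf size before training starts.
pre_pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5, random_state=0)
pre_pruned.fit(X_train, y_train)

# Post-pruning: let the tree grow fully, then prune with cost-complexity alpha.
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)
post_pruned.fit(X_train, y_train)

print(pre_pruned.score(X_test, y_test), post_pruned.score(X_test, y_test))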


Ensemble — Random Forest:

Random Forest is an ensemble technique for classification and regression by bootstrapping multiple decision trees. Random Forest follows bootstrap sampling and aggregation techniques to prevent overfitting.

references:

https://towardsdatascience.com/3-techniques-to-avoid-overfitting-of-decision-trees-1e7d3d985a09#:~:text=Pruning%20refers%20to%20a%20technique,%2Dpruning%20and%20Post%2Dpruning.

AI/ML. Advantages of random forest

Random forests are one of the state-of-the-art supervised machine learning methods and achieve good performance in high-dimensional settings where p, the number of predictors, is much larger than n, the number of observations

Random forests are great with high-dimensional data, since each tree works with subsets of the data.


Quick Prediction/Training Speed : It is faster to train than decision trees because we are working only on a subset of features in this model, so we can easily work with hundreds of features. Prediction speed is significantly faster than training speed because we can save generated forests for future uses.


Robust to Outliers and Non-linear Data: Random forest handles outliers by essentially binning them. It is also indifferent to non-linear features.


Handles Unbalanced Data: It has methods for balancing error in class population unbalanced data sets. Random forest tries to minimize the overall error rate, so when we have an unbalance data set, the larger class will get a low error rate while the smaller class will have a larger error rate.


Low Bias, Moderate Variance

Each decision tree has a high variance, but low bias. But because we average all the trees in random forest, we are averaging the variance as well so that we have a low bias and moderate variance model.


Parallelizable

They are parallelizable, meaning that we can split the process to multiple machines to run. This results in faster computation time. Boosted models are sequential in contrast, and would take longer to compute.


Side note: specifically, in Python (scikit-learn), provide the parameter n_jobs=-1; the -1 tells the library to use all available cores. See the scikit-learn documentation for further details.
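For example, in scikit-learn (a minimal sketch with an arbitrary synthetic dataset):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=50, random_state=0)

# Each tree sees a bootstrap sample and a random subset of features;
# n_jobs=-1 trains the trees in parallel on all available cores.
clf = RandomForestClassifier(n_estimators=200, max_features="sqrt", n_jobs=-1, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))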


References:

https://journals.sagepub.com/doi/full/10.1177/0962280220946080

AI/ML why boosting method is sensitive to outliers

Outliers can be bad for boosting because boosting builds each tree on previous trees' residuals/errors. Outliers will have much larger residuals than non-outliers, so gradient boosting will focus a disproportionate amount of its attention on those points.


references:

https://stats.stackexchange.com/questions/140215/why-boosting-method-is-sensitive-to-outliers


AI/ML can early stopping avoid overfitting?

The answer is yes. A problem with training neural networks is choosing the number of training epochs to use: too many epochs can lead to overfitting of the training dataset, whereas too few may result in an underfit model.


Early stopping is a method that allows you to specify an arbitrarily large number of training epochs and stop training once the model performance stops improving on the validation dataset.

This requires that a validation split be provided to the fit() function, along with an EarlyStopping callback specifying the performance measure to monitor on that validation split.


model.fit(train_X, train_y, validation_split=0.3, callbacks=[EarlyStopping(monitor='val_loss')])

That is all that is needed for the simplest form of early stopping. Training will stop when the chosen performance measure stops improving. To discover the training epoch on which training was stopped, the “verbose” argument can be set to 1. Once stopped, the callback will print the epoch number.


EarlyStopping(monitor='val_loss', verbose=1)

Often, the first sign of no improvement may not be the best time to stop training. This is because the model may get slightly worse before getting much better. We can account for this by adding a delay to the trigger in terms of the number of epochs on which we would like to see no improvement. This can be done by setting the “patience” argument.


EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=50)

The exact amount of patience will vary between models and problems. A common rule of thumb is to set it to about 10% of the total number of epochs.
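Putting the pieces together, a minimal Keras sketch looks roughly like this (it assumes model, train_X and train_y have already been defined; the epoch count of 500 is just an illustrative upper bound):

from tensorflow.keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 50 consecutive epochs
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=50)
history = model.fit(train_X, train_y,
                    epochs=500,
                    validation_split=0.3,
                    callbacks=[es])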



References

https://medium.com/zero-equals-false/early-stopping-to-avoid-overfitting-in-neural-network-keras-b68c96ed05d9#:~:text=A%20problem%20with%20training%20neural,result%20in%20an%20underfit%20model.


AI/ML Dropout Regularization to Handle Overfitting in Deep Learning Models

Dropout regularization is a technique that randomly drops a number of neurons in a neural network during model training.


This means the contribution of the dropped neurons is temporarily removed, so they have no impact on that training update.


Dropout regularization will ensure the following:


The neurons can't rely on any single input because it might be dropped out at random. This reduces the co-adaptation that comes from over-relying on particular inputs, which is a major cause of overfitting.

Neurons will not learn redundant details of the inputs, so only the important information is retained, which helps the network generalize when making predictions.
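A minimal Keras sketch of how dropout is typically inserted between layers (the layer sizes and the 0.5 rate are illustrative choices, not from the reference):

from tensorflow.keras import layers, models

# Dropout(0.5) randomly zeroes half of the previous layer's outputs on each
# training update; it is inactive at inference time
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(20,)),
    layers.Dropout(0.5),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])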


References

https://www.section.io/engineering-education/dropout-regularization-to-handle-overfitting-in-deep-learning-models/#getting-started-with-dropout-regularization


AI/ML What is random forest

The random forest algorithm avoids overfitting by combining multiple trees, which gives accurate and precise results. A single decision tree requires less computation, so it is quicker to build and use, but it typically carries lower accuracy.


Random forests are a strong modeling technique and much more robust than a single decision tree. They aggregate many decision trees to limit overfitting as well as error due to bias and therefore yield useful results.


Advantages to using decision trees:

1. Easy to interpret and make for straightforward visualizations.

2. The internal workings are capable of being observed and thus make it possible to reproduce work.

3. Can handle both numerical and categorical data.

4. Perform well on large datasets

5. Are extremely fast


Ideally, we would like to minimize both error due to bias and error due to variance. Enter random forests. Random forests mitigate this problem well. A random forest is simply a collection of decision trees whose results are aggregated into one final result. Their ability to limit overfitting without substantially increasing error due to bias is why they are such powerful models.


One way random forests reduce variance is by training each tree on a different sample of the data. A second way is by using a random subset of features. This means that if we have 30 features, the forest will only use a certain number of those features in each tree, say five, so any single tree omits 25 features that could be useful. But a random forest is a collection of decision trees, so if we use many trees in our forest, eventually many or all of the features will have been included. This inclusion of many features helps limit our error due to bias as well as our error due to variance. If features weren't chosen randomly, the base trees in our forest could become highly correlated, because a few particularly predictive features would be chosen in many of the base trees. If many of these trees relied on the same features, we would not be combating error due to variance.
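In scikit-learn, both sources of randomness correspond to constructor parameters; a rough sketch of the 30-features/5-per-split example from above (the counts are illustrative):

from sklearn.ensemble import RandomForestClassifier

# With 30 features in the data, max_features=5 makes each split consider only
# 5 randomly chosen features, while bootstrap=True trains each tree on a
# different random sample of the rows
clf = RandomForestClassifier(n_estimators=100, max_features=5, bootstrap=True)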


references:

https://towardsdatascience.com/decision-trees-and-random-forests-df0c3123f991

AI/ML functionalities of various gates in LSTM

The forget gate determines which relevant information from the prior steps is needed, the input gate decides what relevant information can be added from the current step, and the output gate finalizes the next hidden state.


The forget gate decides which information needs attention and which can be ignored. The information from the current input X(t) and the previous hidden state h(t-1) is passed through a sigmoid function, which generates values between 0 and 1. A value closer to 1 means the corresponding part of the old cell state is kept; a value closer to 0 means it is forgotten. This value f(t) is later used by the cell for point-by-point multiplication.


The input gate performs the following operations to update the cell status.


First, the current input X(t) and the previous hidden state h(t-1) are passed into a second sigmoid function, which transforms the values to between 0 (not important) and 1 (important).


Next, the same hidden-state and current-input information is passed through the tanh function. To regulate the network, the tanh operator creates a vector C~(t) with values between -1 and 1. The output values generated from the two activation functions are then ready for point-by-point multiplication.


The output gate


First, the values of the current input and the previous hidden state are passed into a third sigmoid function. Then the new cell state (already updated by the forget and input gates) is passed through the tanh function. These two outputs are multiplied point-by-point. Based on the final value, the network decides which information the new hidden state should carry, and this hidden state is used for prediction.
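The gate computations above can be written roughly as follows; this is a minimal NumPy sketch where the per-gate weight matrices W, U and biases b are illustrative names, not taken from the reference:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b are dicts holding one weight matrix / bias vector per gate
    f_t = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])      # forget gate f(t)
    i_t = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])      # input gate i(t)
    c_tilde = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])  # candidate values C~(t)
    c_t = f_t * c_prev + i_t * c_tilde                          # new cell state
    o_t = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])      # output gate o(t)
    h_t = o_t * np.tanh(c_t)                                    # new hidden state
    return h_t, c_t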



References:

https://www.pluralsight.com/guides/introduction-to-lstm-units-in-rnn

What are the uses of pooling layers in CNN

The main purpose of a pooling layer is to progressively reduce the spatial size of the input, so that the number of computations in the network is reduced. Pooling performs downsampling by shrinking the feature maps and passes only the important information on to the next layers of the CNN.
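A small illustrative Keras sketch (the layer sizes are arbitrary): a 2x2 max-pooling layer halves each spatial dimension of the feature maps produced by the convolution.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),  # -> (26, 26, 16)
    layers.MaxPooling2D(pool_size=(2, 2)),                                   # -> (13, 13, 16)
])
model.summary()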


References: 

https://www.researchgate.net/figure/Figurec-convolution-operation-B-Pooling-Layer-The-main-purpose-of-pooling-layer-is-to_fig2_332570921#:~:text=Pooling%20Layer%20The%20main%20purpose%20of%20pooling%20layer%20is%20to,to%20next%20layers%20in%20CNN.

AI/ML How to find size of output tensor after convolutions

Generally, in a Convolutional Neural Network, the input image undergoes multiple convolution operations, where each convolution operation might change the size of the input image. 

If m x n is the size of the input tensor, k x k is the size of the convolution filter, and s is the stride, then the size of the resulting tensor X x Y after one convolution operation (applied repeatedly for a series of convolutions, assuming no padding) can be found using the following formula:

X = ((m - k) / s) + 1,  Y = ((n - k) / s) + 1
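A quick worked example of the formula (the input and filter sizes are just for illustration):

def conv_output_size(m, n, k, s):
    # Output height and width for a k x k filter with stride s and no padding
    return (m - k) // s + 1, (n - k) // s + 1

print(conv_output_size(28, 28, 3, 1))  # (26, 26)
print(conv_output_size(28, 28, 3, 2))  # (13, 13)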

references:

https://www.theclickreader.com/stride-and-calculation-of-output-size/ 

AI/ML What is learning rate

The learning rate (λ) is a hyper-parameter that defines the size of the adjustment to the network's weights with respect to the loss gradient. It determines how fast or slow we move towards the optimal weights.


The Gradient Descent Algorithm estimates the weights of the model in many iterations by minimizing a cost function at every step.


Here is the algorithm

Repeat until convergence {

     Wj = Wj - λ * ∂F(Wj)/∂Wj

}

Where:

Wj is the j-th weight,

λ is the learning rate, and

∂F(Wj)/∂Wj is the partial derivative of the cost function F with respect to Wj.

In order for Gradient Descent to work, we must set the learning rate to an appropriate value. This parameter determines how fast or slow we will move towards the optimal weights. If the learning rate is very large we will skip the optimal solution. If it is too small we will need too many iterations to converge to the best values. So using a good learning rate is crucial.


In simple language, we can define learning rate as how quickly our network abandons the concepts it has learned up until now for new ones.
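To make the effect of the learning rate concrete, here is a minimal sketch of gradient descent on the toy cost F(w) = (w - 3)^2, whose minimum is at w = 3 (the function and values are purely illustrative):

def grad_F(w):
    return 2 * (w - 3)  # dF/dw

w = 0.0
learning_rate = 0.1  # too large overshoots the optimum; too small converges slowly

for _ in range(100):
    w = w - learning_rate * grad_F(w)

print(w)  # close to 3.0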



References:

https://towardsdatascience.com/https-medium-com-dashingaditya-rakhecha-understanding-learning-rate-dd5da26bb6de

What is back propagation

Backpropagation is a strategy to compute the gradient in a neural network. The method that does the updates is the training algorithm. For example, Gradient Descent, Stochastic Gradient Descent, and Adaptive Moment Estimation.


Lastly, since backpropagation is a general technique for calculating the gradients, we can use it for any function, not just neural networks. Additionally, backpropagation isn’t restricted to feedforward networks. We can apply it to recurrent neural networks as well.


Note the difference between feedforward neural networks and backpropagation: the former term refers to a type of network without feedback connections forming closed loops, while the latter is a way of computing the partial derivatives during training.

References:

https://www.baeldung.com/cs/neural-networks-backprop-vs-feedforward#:~:text=Lastly%2C%20since%20backpropagation%20is%20a,recurrent%20neural%20networks%20as%20well..

Saturday, October 1, 2022

Javascript Dynamic Import


The import() call, commonly called dynamic import, is a function-like expression that allows loading an ECMAScript module asynchronously and dynamically into a potentially non-module environment.

(async () => {
  if (somethingIsTrue) {
    // import module for side effects
    await import("/modules/my-module.js");
  }
})();

Importing defaults

(async () => {
  if (somethingIsTrue) {
    const {
      default: myDefault,
      foo,
      bar,
    } = await import("/modules/my-module.js");
  }
})();


references:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/import

AI/ML Difference between Dense and Activation layer in Keras

I was wondering what was the difference between Activation Layer and Dense layer in Keras.

Since the Activation layer seems similar to a fully connected layer, and Dense has a parameter to pass an activation function, what is the best practice?

Let's imagine a fictional network like this: Input -> Dense -> Dropout -> Final Layer. Should the final layer be Dense(activation=softmax) or Dense followed by Activation(softmax)? Which is cleanest, and why?

Using Dense(activation=softmax) is computationally equivalent to first adding a Dense layer and then adding Activation(softmax). However, there is one advantage to the second approach: you can retrieve the outputs of the last layer (before the activation) from such a model. With the first approach, that is not possible.
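For illustration, the two variants from the question could be sketched roughly like this in Keras (the layer sizes are arbitrary):

from tensorflow.keras import layers, models

# Variant 1: activation fused into the final Dense layer
model_a = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(20,)),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),
])

# Variant 2: separate Activation layer, so the pre-softmax outputs of the
# last Dense layer remain retrievable
model_b = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(20,)),
    layers.Dropout(0.5),
    layers.Dense(10),
    layers.Activation('softmax'),
])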


References:

https://stackoverflow.com/questions/40866124/difference-between-dense-and-activation-layer-in-keras#:~:text=Using%20Dense(activation%3Dsoftmax),the%20first%20approach%20%2D%20it's%20impossible.



AI/ML what is one hot encoding?

One Hot Encoding is a common way of preprocessing categorical features for machine learning models. This type of encoding creates a new binary feature for each possible category and assigns a value of 1 to the feature of each sample that corresponds to its original category.
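A tiny illustrative sketch with scikit-learn's OneHotEncoder (the toy colour data is made up):

from sklearn.preprocessing import OneHotEncoder

# Each colour becomes its own binary column (columns are sorted: blue, green, red)
X = [['red'], ['green'], ['blue'], ['green']]
encoded = OneHotEncoder().fit_transform(X)
print(encoded.toarray())
# [[0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]]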

Limitations of One Hot Encoding

The operation of one-hot encoding categorical variables is actually a simple embedding where each category is mapped to a different vector. This process takes discrete entities and maps each observation to a vector of 0s and a single 1 signaling the specific category.


The one-hot encoding technique has two main drawbacks:


For high-cardinality variables — those with many unique categories — the dimensionality of the transformed vector becomes unmanageable.

The mapping is completely uninformed: “similar” categories are not placed closer to each other in embedding space.

The first problem is well-understood: for each additional category — referred to as an entity — we have to add another number to the one-hot encoded vector. If we have 37,000 books on Wikipedia, then representing these requires a 37,000-dimensional vector for each book, which makes training any machine learning model on this representation infeasible.


The second problem is equally limiting: one-hot encoding does not place similar entities closer to one another in vector space. If we measure similarity between vectors using the cosine distance, then after one-hot encoding, the similarity is 0 for every comparison between entities.


This means that entities such as War and Peace and Anna Karenina (both classic books by Leo Tolstoy) are no closer to one another than War and Peace is to The Hitchhiker’s Guide to the Galaxy if we use one-hot encoding.



references:

https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526

AI/ML what does embedding do in RNN?

In the context of neural networks, embeddings are low-dimensional, learned continuous vector representations of discrete variables. Neural network embeddings are useful because they can reduce the dimensionality of categorical variables and meaningfully represent categories in the transformed space.


One notably successful use of deep learning is embedding, a method used to represent discrete variables as continuous vectors. This technique has found practical applications with word embeddings for machine translation and entity embeddings for categorical variables.

Neural network embeddings have 3 primary purposes:

Finding nearest neighbors in the embedding space. These can be used to make recommendations based on user interests or cluster categories.

As input to a machine learning model for a supervised task.

For visualization of concepts and relations between categories.

In terms of the Wikipedia book example, this means that using neural network embeddings we can take all 37,000 book articles and represent each one with a vector of only 50 numbers. Moreover, because embeddings are learned, books that are more similar in the context of our learning problem are closer to one another in the embedding space.
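As a rough Keras sketch of how an embedding layer feeds an RNN (the vocabulary size of 37,000 and the 50-dimensional embedding mirror the example above; the rest of the model is illustrative):

from tensorflow.keras import layers, models

# Map up to 37,000 discrete IDs to 50-dimensional learned vectors,
# then feed the sequence of vectors into an LSTM
model = models.Sequential([
    layers.Embedding(input_dim=37000, output_dim=50),
    layers.LSTM(32),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')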

references:

https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526


AI/ML What is POS Tagging

Part-of-speech (POS) tagging is a popular Natural Language Processing process which refers to categorizing words in a text (corpus) in correspondence with a particular part of speech, depending on the definition of the word and its context.


Part-of-speech tags describe the characteristic structure of lexical terms within a sentence or text, therefore, we can use them for making assumptions about semantics. Other applications of POS tagging include:


Named Entity Recognition

Co-reference Resolution

Speech Recognition


Taking the example “Why not tell someone?”, imagine the sentence is truncated to “Why not tell … ” and we want to determine whether the following word in the sentence is a noun, verb, adverb, or some other part of speech.


Now, if you are familiar with English, you would instantly identify “tell” as a verb and assume that it is more likely to be followed by a noun than by another verb. Therefore, the idea shown in this example is that the POS tag assigned to the next word depends on the POS tag of the previous word.


By associating a number with each possible transition (the likelihood of the next tag given the current tag), we can say there is a higher likelihood that the next word in our sentence is a noun rather than another verb, given that the current word is a verb.
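A minimal sketch of POS tagging in practice, using NLTK's built-in tagger (the tokenizer and tagger resources need to be downloaded once; the printed tags are indicative):

import nltk

nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

tokens = nltk.word_tokenize("Why not tell someone?")
print(nltk.pos_tag(tokens))
# e.g. [('Why', 'WRB'), ('not', 'RB'), ('tell', 'VB'), ('someone', 'NN'), ('?', '.')]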



References:

https://towardsdatascience.com/part-of-speech-tagging-for-beginners-3a0754b2ebba