Latest News

Saturday, December 18, 2021

Power Transform

Power transformation maps data from an arbitrary distribution to one close to a Gaussian distribution, since normality of the features is assumed in many modeling scenarios. Transforming the data also helps stabilize variance and minimize skewness.

For instance, the following algorithms perform better or converge faster when features are close to normally distributed:

·         linear and logistic regression

·         nearest neighbors

·         neural networks

·         support vector machines with radial basis function kernels

·         principal components analysis

·         linear discriminant analysis

The PowerTransformer class can be accessed from the sklearn.preprocessing package. PowerTransformer provides two transformations: the Yeo-Johnson transform and the Box-Cox transform.

The Yeo-Johnson transform is:

x^(λ) = ((x + 1)^λ - 1) / λ                   if λ ≠ 0, x ≥ 0
x^(λ) = ln(x + 1)                             if λ = 0, x ≥ 0
x^(λ) = -((-x + 1)^(2-λ) - 1) / (2 - λ)       if λ ≠ 2, x < 0
x^(λ) = -ln(-x + 1)                           if λ = 2, x < 0

The Box-Cox transform is:

x^(λ) = (x^λ - 1) / λ     if λ ≠ 0
x^(λ) = ln(x)             if λ = 0

Important Points:

Box-Cox can only be applied to strictly positive data, while Yeo-Johnson supports both positive and negative values.

Both transformations are parameterized by λ, which is determined through maximum likelihood estimation.
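A minimal sketch of the Box-Cox variant with scikit-learn, using synthetic log-normal data (strictly positive, as Box-Cox requires):

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Skewed, strictly positive data (log-normally distributed)
rng = np.random.RandomState(0)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 1))

# method="yeo-johnson" (the default) also handles negative values;
# standardize=True additionally scales the output to zero mean, unit variance
pt = PowerTransformer(method="box-cox", standardize=True)
X_trans = pt.fit_transform(X)

# λ is estimated by maximum likelihood; for log-normal data it is near 0,
# i.e. Box-Cox reduces to a log transform
print("fitted lambda:", pt.lambdas_[0])
```

For log-normal input the fitted λ comes out close to zero, which matches the piecewise definition above (λ = 0 gives ln(x)).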


Friday, July 30, 2021

1. Log in to the Oracle Fusion application.

2. Go to Navigator and, under Configuration, choose Sandboxes.

3. Click 'Create Sandbox' and provide a name. Make sure there is no space in the sandbox name.

4. Select the Appearance tool.

5. Click 'Create and Enter'.

6. A yellow pop-up bar will be visible. Click Tools and select Appearance.

7. You can see a 'Logo' option there. Select 'File' and upload your logo image.

8. Click Actions and choose 'Save As'.

9. Your updated logo will be displayed.

10. Click the Apply button.

11. After that, click your sandbox name "LogoChange" on the left-hand side and click Publish.

12. It will then navigate to the Sandboxes page. Click Publish, then Done.

13. Your logo will be changed.

Sunday, June 13, 2021

The process of transferring one image's aesthetic style to another is known as neural style transfer. The algorithm takes three images: an input image, a content image, and a style image, and modifies the input to match the content of the content image and the artistic style of the style image.

The Basic Principle behind Neural Style Transfer

The basic idea behind neural style transfer is to define two distance functions: one that describes how different the content of two images is, Lcontent, and another that characterizes the difference in style between two images, Lstyle. Then, given three images (the desired style image, the desired content image, and the input image, initialized with the content image), we alter the input image so that its content distance from the content image and its style distance from the style image are both as small as possible.

Content Image

Style Image

Importing Packages and Selecting a Device

Below is a list of the packages needed to implement neural style transfer:

  • torch, torch.nn, numpy (indispensable packages for neural networks with PyTorch)
  • torch.optim (efficient gradient descents)
  • PIL, PIL.Image, matplotlib.pyplot (load and display images)
  • torchvision.transforms (transform PIL images into tensors)
  • torchvision.models (train or load pre-trained models)
  • copy (to deep copy the models; system package)

General steps to perform style transfer:

  1. Visualize data
  2. Basic Preprocessing/preparing our data
  3. Set up loss functions
  4. Create model
  5. Optimize for loss function
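The loss functions in step 3 can be sketched as follows. This is a minimal illustration in PyTorch: in a full implementation the feature maps would come from layers of a pre-trained CNN such as VGG, whereas here random tensors stand in for those activations:

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: (batch, channels, height, width) feature map from a CNN layer
    b, c, h, w = feat.size()
    flat = feat.view(b * c, h * w)
    # The Gram matrix captures correlations between channels (the "style"),
    # normalized by the number of elements
    return flat @ flat.t() / (b * c * h * w)

def content_loss(input_feat, content_feat):
    # Lcontent: how far the input's content is from the content image's content
    return F.mse_loss(input_feat, content_feat)

def style_loss(input_feat, style_feat):
    # Lstyle: how far the input's style is from the style image's style
    return F.mse_loss(gram_matrix(input_feat), gram_matrix(style_feat))

# Toy feature maps standing in for CNN activations
content = torch.randn(1, 8, 16, 16)
style = torch.randn(1, 8, 16, 16)
x = content.clone()  # input initialized with the content image

# Step 5 would minimize this weighted sum by gradient descent on x
total = content_loss(x, content) + 1e6 * style_loss(x, style)
```

Since the input starts as a copy of the content image, its content loss starts at zero and the optimizer trades content fidelity against style similarity.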


You can find the complete code for this article at the URL given below.


Wednesday, May 26, 2021

What is a List Comprehension?

List comprehensions are a quick and easy way to make lists, and they are one of Python's most important features. They are used to create new lists from other iterables, or to create a subsequence of the elements that satisfy a certain condition.


Suppose we want a list containing all the characters of a string. Using a list comprehension, we can write it as below:

We can solve the same problem in the traditional way, as below:
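Both approaches can be sketched as follows, splitting the string "python" into its characters:

```python
s = "python"

# List comprehension: one expression builds the whole list
chars = [ch for ch in s]
print(chars)  # ['p', 'y', 't', 'h', 'o', 'n']

# Traditional way: an explicit loop with append
chars_loop = []
for ch in s:
    chars_loop.append(ch)
print(chars_loop)  # same result
```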

Hacker Rank Problem:

Hacker Rank Solution:


What is a Task Orchestration Tool?

Cleaning data, training machine learning models, monitoring performance, and deploying the models to a production server are common tasks for smaller teams to begin with. The number of repetitive steps increases as the team and solution expand in size. It becomes much more important that these activities are completed in a timely manner.

The degree to which these activities are interdependent grows as well. You will have a pipeline of activities that need to be run once a week or once a month when you first start out. These tasks must be completed in the correct order. This pipeline evolves into a network of dynamic branches as you expand. In several cases, some tasks trigger the execution of others, which may be dependent on the completion of some other tasks first.

This network can be represented as a DAG (Directed Acyclic Graph), which represents each task and its interdependencies.
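As a minimal, library-free sketch, such a DAG of tasks can be represented as a dictionary of dependencies and executed in a valid order with a topological sort (the task names here are hypothetical; real orchestrators like Airflow add scheduling, retries, and monitoring on top of this idea):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the set of tasks it depends on
dag = {
    "clean_data": set(),
    "train_model": {"clean_data"},
    "evaluate": {"train_model"},
    "deploy": {"evaluate"},
    "monitor": {"deploy"},
}

# static_order() yields tasks so that every dependency runs first
order = list(TopologicalSorter(dag).static_order())
print(order)
```

Because the graph is acyclic, a valid execution order always exists; a cycle would make TopologicalSorter raise an error, which is exactly why orchestration tools insist on DAGs.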

Pipeline (Credit: Google Images)

DAG (Credit: Google Images)

There has been a recent proliferation of new tools for orchestrating task and data workflows (an area also known as "MLOps"). Since the sheer number of these tools makes it difficult to determine which to use and how they interact, we decided to pit some of the most common against one another.

Source: Google Image

It is clear that the most common solution is Airflow, followed by Luigi. There are also newer candidates, all of which are rapidly expanding.

Comparison Table

[Table: feature comparison of task orchestration tools, including Apache Airflow]
While each of these techniques has its own set of strengths and weaknesses, none of them can guarantee a pain-free procedure right out of the box. Before you start worrying about the tool to use, make sure you have strong processes in place, such as positive team culture, blame-free retrospectives, and long-term goals.

Friday, May 21, 2021

MLOps is a DevOps extension in which the DevOps principles are applied to machine learning pipelines. Creating a machine learning pipeline differs from creating software, primarily due to the data aspect. The model's quality is determined by more than just the code's quality.

It is also determined by the quality of the data — i.e. the features — used to run the model. According to Airbnb, data scientists spend roughly 60% to 80% of their time creating, training, and testing data. Feature stores allow data scientists to reuse features rather than rebuilding them for each new model, saving valuable time and effort. Feature stores automate this process and can be triggered by Git-pushed code changes or the arrival of new data. This automated feature engineering is a crucial component of the MLOps concept.

MLOps is the intersection of Machine Learning, DevOps, and Data Engineering

Photo by Kevin Ku from Pexels

The process of creating features is known as feature engineering, and it is a complex but essential component of any machine learning process. Better features equal better models, which equals a better business outcome.

Generating a new feature requires enormous work, and building the feature pipeline is only one part of it. You probably went through a long trial-and-error process, with a large number of candidate features, to get to the point of being pleased with your unique new feature. Next, operational pipelines are needed to calculate and store the features, and these differ depending on whether the features are online or offline.

In addition, every data science project begins with the search for the right features. The problem is that there is usually no single, centralized place to search; features are scattered everywhere.

The feature store is not only a data layer; it also allows users to transform raw data and store it as features that are ready for use in any type of machine learning model.

There are two types of features: online and offline.

Offline Features: Many of the features are calculated offline as part of a batch job. As an example, consider the average monthly spend of a customer. They are mostly used by offline processes. Because these types of computations can take a long time, they are calculated using frameworks such as Spark or by simply running complex SQL queries against a set of databases and then using a batch inference process.
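The average-monthly-spend example above could be sketched as a small batch job with pandas (the column names and data are hypothetical; at scale the same aggregation would run in Spark or SQL):

```python
import pandas as pd

# Toy transaction log standing in for a production table
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "month": ["2021-01", "2021-01", "2021-02", "2021-01", "2021-02"],
    "amount": [100.0, 50.0, 80.0, 200.0, 40.0],
})

# Offline feature: average monthly spend per customer, computed as a
# batch job and then pushed into a feature store table
monthly = transactions.groupby(["customer_id", "month"])["amount"].sum()
avg_monthly_spend = monthly.groupby(level="customer_id").mean()
print(avg_monthly_spend)
```

Customer 1 spends 150 in January and 80 in February, so the feature value is 115; customer 2 gets (200 + 40) / 2 = 120.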

Data preparation pipelines push data into the Feature Store tables and training data repositories.

Online Features: These features are a little more complicated because they must be calculated quickly and are frequently served in milliseconds. Calculating a z-score, for example, for real-time fraud detection. In this case, the pipeline is built in real time by calculating the mean and standard deviation over a sliding window. These calculations are much more difficult, necessitating quick computation as well as quick access to the data. The information can be kept in memory or in a very fast key-value database. The process itself can be carried out on various cloud services or on a platform such as the Iguazio Data Science Platform, which includes all of these components as part of its core offering.
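The sliding-window z-score described above can be sketched in pure Python (the window size and amounts are hypothetical; a production system would keep the window in memory or a fast key-value store):

```python
from collections import deque
from statistics import mean, stdev

class SlidingZScore:
    """Z-score of each new value relative to a sliding window of recent values."""

    def __init__(self, window=100):
        # deque with maxlen automatically drops the oldest value
        self.values = deque(maxlen=window)

    def update(self, x):
        score = None
        if len(self.values) >= 2:  # need at least 2 points for a std deviation
            m, s = mean(self.values), stdev(self.values)
            score = (x - m) / s if s > 0 else 0.0
        self.values.append(x)
        return score

detector = SlidingZScore(window=5)
for amount in [10, 12, 11, 9, 10, 500]:  # 500 is an obvious outlier
    z = detector.update(amount)
print("z-score of last value:", z)
```

The final transaction of 500 is scored against a window whose mean is about 10, yielding a huge z-score, which is the signal a real-time fraud detector would act on.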

Model training jobs use Feature Store and training data repository data sets to train models and then push them to the model repository.

Advantages of Feature Store:

  • Faster development
  • Smooth model deployment in production
  • Increased model accuracy
  • Better collaboration
  • Track lineage and address regulatory compliance
