Guidelines For Creating And Using CTE With Syntax (updated December 2023)


Introduction to SQL CTE

A common table expression (CTE), introduced in SQL Server 2005, is a temporary named result set. It is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or MERGE statement. A CTE can also be used in a CREATE VIEW statement as part of the view’s defining SELECT statement. A CTE can include references to itself, in which case it is called a recursive CTE.


The following shows the common syntax of a CTE in SQL Server:

WITH expression_name[(column_name [,...])] AS (CTE_definition) SQL_statement;

In the above expression:

First, specify the expression name (expression_name) that the query will later use to refer to the CTE. The expression name is similar to a table name when we create a table.

Next, after expression_name, optionally specify a comma-separated list of column names. The number of columns in this list must equal the number of columns returned by CTE_definition.

If the column list is omitted, the CTE takes its column names from the columns returned by CTE_definition.

After that, write the AS keyword followed by CTE_definition in parentheses: a SELECT statement whose result set populates the CTE.

Finally, refer to the CTE in a statement such as SELECT, INSERT, UPDATE, DELETE, or MERGE.
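The parts of the syntax can be tried out without a SQL Server instance. The following is a minimal sketch that runs a CTE through Python's built-in sqlite3 module; SQLite is a stand-in here, and the Department table, its rows, and the names cte_dept, DeptName, and DeptGroup are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT, GroupName TEXT)"
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?, ?)",
    [(1, "Engineering", "Research and Development"), (2, "Sales", "Sales and Marketing")],
)

query = """
WITH cte_dept (DeptName, DeptGroup) AS (      -- expression_name and column list
    SELECT Name, GroupName FROM Department    -- CTE_definition
)
SELECT DeptName FROM cte_dept ORDER BY DeptName;  -- statement that uses the CTE
"""
names = [row[0] for row in conn.execute(query)]
print(names)
conn.close()
```

The column list has two entries because the CTE definition returns two columns; a mismatch in the counts makes the statement fail.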

Guidelines for Creating and Using CTE

A CTE must be followed by a single SELECT, INSERT, UPDATE, DELETE, or MERGE statement that references it. A CTE can also be specified in a CREATE VIEW statement as part of the view’s defining SELECT statement.

Multiple CTE query definitions can be defined in a non-recursive CTE. The definitions must be combined by one of these set operators: UNION ALL, UNION, INTERSECT, or EXCEPT.

A CTE can reference itself and CTEs defined earlier in the same WITH clause. Forward references are not allowed.

Specifying more than one WITH clause in a CTE is not permitted. For instance, if a CTE_query_definition contains a subquery, that subquery cannot contain a nested WITH clause that defines another CTE.

The following clauses cannot be used in CTE_definition:

ORDER BY (except when a TOP clause is specified)

INTO

OPTION clause with query hints

FOR BROWSE

When a CTE is used in a statement that is part of a batch, the statement before it must be terminated with a semicolon.

A query that references a CTE can be used to define a cursor.

A CTE can reference tables on remote servers.

Guidelines for Creating and Using Recursive CTE

The definition of a recursive CTE must contain at least two CTE query definitions: an anchor member and a recursive member. All anchor members must be defined before the first recursive member.

The anchor and the recursive members must have the same number of columns.
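As a rough illustration of the anchor and recursive members, here is a small sketch run through Python's sqlite3 module. It assumes SQLite rather than SQL Server: SQLite conventionally writes WITH RECURSIVE, while SQL Server uses plain WITH. The CTE name counter and the upper bound of 5 are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

query = """
WITH RECURSIVE counter (n) AS (
    SELECT 1                                  -- anchor member: one column
    UNION ALL
    SELECT n + 1 FROM counter WHERE 5 > n     -- recursive member: one column
)
SELECT n FROM counter ORDER BY n;
"""
numbers = [row[0] for row in conn.execute(query)]
print(numbers)
conn.close()
```

The anchor member comes first and both members return a single column, matching the two rules above. The WHERE condition is what stops the recursion.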

Common Table Expression in SQL Examples

Let’s start with some basic examples of using common table expressions:

Example #1 – Simple CTE Example in SQL Server

This query uses a CTE to return the department IDs.


WITH cte_dept AS (
    SELECT * FROM HumanResources.Department
)
SELECT DepartmentID FROM cte_dept;


Explanation: This is the most basic example. First, we defined cte_dept as the name of the common table expression. Because we did not specify a column list, the CTE exposes all the columns of the Department table returned by the query definition. The outer query then selects only DepartmentID from the CTE.

Example #2 – Simple SQL Server CTE Example with column names


WITH cte_deptName (DepartmentName, DepartmentGroupName) AS (
    SELECT Name, GroupName FROM HumanResources.Department
)
SELECT DepartmentName FROM cte_deptName;


Explanation: First, we defined cte_deptName as the name of the CTE and specified its column names, DepartmentName and DepartmentGroupName.

Second, we created a query that returns the Name and GroupName columns from the Department table; these map to the column names above. The third step is to use the CTE in the outer query and select only DepartmentName from the CTE.

Example #3 – Using multiple SQL Server CTE in a single query


WITH cte_deptName (DepartmentID, DepartmentName) AS (
    SELECT DepartmentID, Name FROM HumanResources.Department
),
cte_deptGroup (DepartmentID, DepartmentGroupName) AS (
    SELECT DepartmentID, GroupName FROM HumanResources.Department
)
SELECT dN.DepartmentName
FROM cte_deptName dN
INNER JOIN cte_deptGroup dG ON dN.DepartmentID = dG.DepartmentID;

Explanation: First, we defined cte_deptName as the name of the first common table expression and specified its column names. We then defined a second CTE, cte_deptGroup, in the same way. Multiple CTEs in one WITH clause are separated by commas.

Second, we wrote the query definitions: the first CTE returns the DepartmentID and Name columns from the Department table, and the second returns DepartmentID and GroupName.

The third step is to use both CTEs in the outer query, which selects only DepartmentName from the inner join of the two CTEs on DepartmentID.

Why Do you Need CTE?

There are several other methods, such as views, temporary tables, or derived tables, that achieve the same result, so why use a CTE? There are several reasons, and some of them are as follows:

Readability: A CTE improves readability. Rather than lumping all the logic into one huge query, we create several CTEs and combine them later in the final SELECT statement.

Substitute for a view: A CTE can substitute for a view, for example when you do not have permission to create a view, or when you need the result only once and do not want to store a definition for later use.

Recursion: CTEs can be used to write recursive queries, that is, queries that reference themselves. These are useful for hierarchical data such as an organizational chart.

Overcoming limitations: A CTE works around limitations of a plain SELECT statement, such as referencing itself or grouping by a column derived from a non-deterministic function.
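To make the recursion point concrete, here is a hedged sketch of walking an organizational chart with a recursive CTE. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the Employee table, its rows, and the org_chart name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER)"
)
# Hypothetical hierarchy: Ava manages Ben and Cara; Ben manages Dan.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Ava", None), (2, "Ben", 1), (3, "Cara", 1), (4, "Dan", 2)],
)

query = """
WITH RECURSIVE org_chart (EmployeeID, Name, Level) AS (
    -- anchor member: employees with no manager
    SELECT EmployeeID, Name, 0 FROM Employee WHERE ManagerID IS NULL
    UNION ALL
    -- recursive member: direct reports of rows already in org_chart
    SELECT e.EmployeeID, e.Name, oc.Level + 1
    FROM Employee AS e
    JOIN org_chart AS oc ON e.ManagerID = oc.EmployeeID
)
SELECT Name, Level FROM org_chart ORDER BY Level, Name;
"""
rows = conn.execute(query).fetchall()
for name, level in rows:
    print("  " * level + name)
conn.close()
```

Each pass of the recursive member adds one more level of the hierarchy, which is hard to express with a plain SELECT when the depth is not known in advance.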


Guidelines For 2023 Boston Marathon Security

Recent Terrorism Informs Security for Boston Marathon

No drones, backpacks, coolers, large bags, or blankets

The Federal Bureau of Investigation, informed by recent terrorist attacks in Paris, San Bernardino, Calif., and Brussels, has put in place a series of security measures to prevent a similar attack during Monday’s Boston Marathon.

The FBI did not detail those measures in its announcement earlier this month. The bureau said there have been no credible threats against the Marathon, but that officials are preparing for what they don’t know.

In addition, police have established a “no-drone zone” along the race route and are asking spectators and runners to leave those aerial video devices at home.

Otherwise, “most of the security aspects of the Marathon are similar to what we experienced last year,” says BU Police Captain Robert Molloy. Security was fortified after the 2013 bombings that killed three, among them graduate student Lu Lingzi (GRS’13), and injured more than 260.

As with last year, race officials ask spectators not to bring backpacks, shoulder bags, blankets, larger packages, coolers, glass containers, and some other items with them. Find a full list of such items here.

“Spectators approaching viewing areas on the course, or in viewing areas on the course, may be asked to pass through security checkpoints, and law enforcement officers or contracted private security personnel may ask to inspect bags and other items being carried,” according to the Massachusetts Emergency Management Agency (MEMA). The agency encourages bystanders to carry permissible personal items in a clear plastic bag to “enhance public safety and speed security screening.” MEMA also offers Massachusetts Alerts, a free, downloadable app for Android and iPhones—runners’ and spectators’—that can receive emergency and public safety information. Find information about the app here.

As in recent years, “bandits,” or unauthorized, unregistered runners, are not permitted to join the race and will be stopped by security personnel.

Almost 5,000 law enforcement officers will be deployed along the 26.2-mile course, including about 15 from the BUPD, which is primarily assigned to secure the area around Audubon Circle. “That has the greatest concentration of BU students” watching the race, Molloy says. As they did last year, University officers will join their Boston and Brookline counterparts in bicycle and motorcycle units.

In a Boston Globe interview after last month’s attacks in Brussels, William Evans, Boston police commissioner, outlined security steps taken since the 2013 Marathon bombings: “We have more cameras out there. We have more tactical units. Officers are undercover working the crowd. We’re constantly monitoring events as they play out, not only across the country, but internationally. We’re going to be on our toes watching for any intelligence that might lead us to someone who might try to jeopardize the Marathon.”

“We know that some students choose to celebrate Marathon Monday by drinking,” says Katharine Mooney (SPH’12), director of SHS Wellness & Prevention Services. “It’s important to share alcohol safety information with these students and others who aren’t drinking, so that everyone is well informed about resources if they see someone they’re concerned about.”

A BU scholarship fund in Lu Lingzi’s memory has raised more than $1 million in gifts and pledges to support international graduate students; contributions may be made here. The Boston University Chinese Students & Scholars Association will sponsor a photo exhibition and silent auction to benefit the fund on Thursday, April 21, from noon until 7 p.m., at the BU Alley in the George Sherman Union basement. Photos, most by BU students, will be auctioned off (the mandatory minimum bid is $15).

The city also will observe its second annual One Boston Day today, April 15, commemorating community response to the April 15, 2013, bombings. The day features philanthropic activities by businesses and nonprofits.


Datagen And Creating Smarter AIs With Synthetic Data

One of the initial tasks artificial intelligence (AI) failed miserably at was facial recognition. It was so bad that it has created a significant grassroots effort to block all facial recognition, and IBM, which pioneered facial recognition, exited that part of the AI market. 

At the core of the problem was biased data sets that had unacceptable issues with minorities and women.  

We’ve learned from companies like NVIDIA that are aggressively using simulation to train self-driving cars and robotics that using simulation and related training at machine speeds can increase the accuracy of autonomous machine training significantly. 

I met recently with a company called Datagen that uses synthetic people to create unbiased facial recognition programs and potentially make metaverse-based collaboration systems more effective.

Let’s explore the use of synthetic people to improve AI accuracy and create the next generation collaboration platforms:  

Clearly, we now know that biased data sets lead to embarrassingly inaccurate AIs, suggesting that market researchers who are trained to identify and eliminate bias as a matter of practice should have been brought in to create practices that would lead to less biased data sets. Using live data, it is virtually impossible to eliminate all bias, without making the data sets so large that they become unmanageable.

You can also run the synthetic data set against real data to both look for bias in the real data set and any unintended bias in the synthetic training set. With synthetic data, you can also use the result, without privacy violations, for a variety of other functions. These would include broader metaverse efforts where realistic artificial people will enhance the apparent reality of the related simulation. Say, for instance, you wanted to showcase the light coverage in the interior of a building once it was occupied. Using real images would create licensing and privacy issues, whereas using synthetic images derived from a variety of people should not.  

This synthetic data doesn’t have to just apply to people either. 

It can be used in security systems in stores to identify shoplifting or help with automated checkout, improve hand tracking for virtual reality (VR) solutions, simulate uses for planned buildings to remove inefficiencies long before construction starts, and improve body tracking accuracy for everything from protecting drivers to improving marketing programs.  

And it would be very handy for home security, in terms of identifying packages and better alerting to porch pirates. It can even be used to help with facial reconstruction after an accident, but one of the most interesting applications is with collaboration products.

Meta is aggressively looking at metaverse-based collaboration where you are represented by an avatar. Avatars, though, can look more like cartoons than people. You can’t use the video image of someone because, in this implementation, most are wearing VR glasses, which are off-putting to everyone in the conversation. What you need is a level of accuracy more like deep fakes, where you look like you and your body and facial expressions appear realistically on your avatar.  

Datagen demonstrated a far more realistic avatar technology using its computer vision algorithms coupled with eye, face, and body tracking, with a particular focus on hand tracking.  

With Datagen’s technology, you shouldn’t need to use a controller, as your hand is your controller. Instead of floating around legless, your entire body is rendered in a more photorealistic way. While Datagen’s current capability is far better than some alternatives, it is still on the wrong side of the uncanny valley in my opinion. But it should improve sharply over time to the point where you can’t tell the difference between an avatar and a real person. This would allow you to freeze your appearance at your favorite age and dress digitally and be able to attend meetings in your pajamas if you want and still look like you are professionally dressed during a remote video call.  

Our future automation efforts will depend on us getting this right and correcting the current lack of trust for facial recognition solutions. Datagen has a set of tools that could massively increase this accuracy and benefit efforts that include far more viable metaverse-based collaboration and communications.  

While young, Datagen appears to be at the forefront of improving computer vision substantially and building future tools that will help us create stronger AIs and a far more accurate metaverse.  

Creating Continuous Action Bot Using Deep Reinforcement Learning

Now we move on to the crux: the actor-critic method. The original paper explains quite well how it works, but here is the rough idea. The actor takes a decision based on a policy, and the critic evaluates the state-action pair and gives it a Q value. If the state-action pair is good according to the critic, it gets a higher Q value, and vice versa.

Now we move to the actor network. We created a similar network, but there are some key points which you must remember while making the actor.

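Actor-Network

# Assumed imports (the import cell is not shown in this section)
import torch as T
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class ActorNetwork(nn.Module):
    # fc1_dims and fc2_dims were undefined in the original listing. The
    # defaults below are the paper's 400/300 and are an assumption.
    def __init__(self, alpha, fc1_dims=400, fc2_dims=300):
        super(ActorNetwork, self).__init__()
        self.input_dims = 2
        self.fc1_dims = fc1_dims
        self.fc2_dims = fc2_dims
        self.n_actions = 2
        # Layer names fc1 and fc2 are taken from forward(); the name of the
        # output layer (mu) was lost in extraction and is an assumption.
        self.fc1 = nn.Linear(self.input_dims, self.fc1_dims)
        self.fc2 = nn.Linear(self.fc1_dims, self.fc2_dims)
        self.mu = nn.Linear(self.fc2_dims, self.n_actions)
        self.optimizer = optim.Adam(self.parameters(), lr=alpha)
        self.device = T.device('cuda' if T.cuda.is_available() else 'cpu')
        # Move the weights to the same device as the input tensors
        self.to(self.device)

    def forward(self, state):
        prob = self.fc1(state)
        prob = F.relu(prob)
        prob = self.fc2(prob)
        prob = F.relu(prob)
        # fixing each action between 0 and 1; the env transforms it further
        mu = T.sigmoid(self.mu(prob))
        return mu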

Note: We used two hidden layers since our action space was small and the environment is not very complex. The authors of the paper used 400 and 300 neurons for the two hidden layers.

Just like the gym env, the agent has some conditions too. We initialized our target networks with the same weights as our original (actor and critic) networks. Since we are chasing a moving target, the target networks create stability and help the original networks to train.

We initialize the agent with all the requirements. As you might have noticed, we have a loss function parameter too. We can try different loss functions and choose whichever works best for us (usually smooth L1 loss); the paper used MSE loss, so we will go ahead and use it as the default.

Here we include a function for choosing an action. You can create an evaluation function as well, which outputs an action without noise. There is also a remember function (just a wrapper) that stores a transition in our memory.

Next is the update-parameters function, where we do soft updates of the target networks; a hard update (tau=1) copies the original networks’ weights at initialization. It takes only one parameter, tau, which plays a role similar to a learning rate.

Tau is used to soft update our target networks. In the paper, the authors found the best tau to be 0.001, and it usually works well across different papers (you can try and play with it).
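To see what tau does numerically, here is a minimal sketch of the soft-update rule on plain Python lists instead of network weights. The function name soft_update and the toy weight values are made up for illustration; the rule itself, tau times the original weight plus (1 - tau) times the target weight, is the one used in the agent code.

```python
def soft_update(target_weights, online_weights, tau=0.001):
    """Blend online weights into target weights: tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for w, t in zip(online_weights, target_weights)]

target = [0.0, 1.0]   # toy target-network weights
online = [1.0, 1.0]   # toy original-network weights

soft = soft_update(target, online, tau=0.001)  # target drifts slightly toward online
hard = soft_update(target, online, tau=1.0)    # tau=1 copies the online weights
print(soft, hard)
```

With a small tau the target weights change slowly, which is what keeps the moving target stable; with tau=1 the target becomes an exact copy.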

# ReplayBuffer and CriticNetwork are defined elsewhere in the article (not
# shown in this section); np is numpy, and T and F are the torch aliases.
class Agent(object):
    # tau and env are moved ahead of the defaulted arguments, because a
    # parameter without a default cannot follow one that has a default.
    def __init__(self, alpha, beta, tau, env, input_dims=2, gamma=0.99,
                 n_actions=2, max_size=1000000, batch_size=64):
        self.gamma = gamma
        self.tau = tau
        self.memory = ReplayBuffer(max_size)
        self.batch_size = batch_size
        self.actor = ActorNetwork(alpha)
        self.critic = CriticNetwork(beta)
        self.target_actor = ActorNetwork(alpha)
        self.target_critic = CriticNetwork(beta)
        self.scale = 1.0
        # Callable, so fresh Gaussian exploration noise is drawn on every call
        self.noise = lambda: np.random.normal(scale=self.scale, size=(n_actions))
        # Hard update: copy the original weights into the target networks
        self.update_network_parameters(tau=1)

    def choose_action(self, observation):
        observation = T.tensor(observation, dtype=T.float).to(self.actor.device)
        mu = self.actor.forward(observation)
        mu_prime = mu + T.tensor(self.noise(), dtype=T.float).to(self.actor.device)
        return mu_prime.cpu().detach().numpy()

    def remember(self, state, action, reward, new_state, done):
        self.memory.store_transition(state, action, reward, new_state, done)

    def learn(self):
        # Wait until the memory holds at least one batch
        if self.memory.mem_cntr < self.batch_size:
            return
        state, action, reward, new_state, done = \
            self.memory.sample_buffer(self.batch_size)
        reward = T.tensor(reward, dtype=T.float).to(self.critic.device)
        done = T.tensor(done).to(self.critic.device)
        new_state = T.tensor(new_state, dtype=T.float).to(self.critic.device)
        action = T.tensor(action, dtype=T.float).to(self.critic.device)
        state = T.tensor(state, dtype=T.float).to(self.critic.device)

        self.target_actor.eval()
        self.target_critic.eval()
        self.critic.eval()
        target_actions = self.target_actor.forward(new_state)
        critic_value_ = self.target_critic.forward(new_state, target_actions)
        critic_value = self.critic.forward(state, action)

        # done[j] multiplies the bootstrapped value, so this assumes the
        # buffer returns 0 for terminal transitions and 1 otherwise.
        target = []
        for j in range(self.batch_size):
            target.append(reward[j] + self.gamma*critic_value_[j]*done[j])
        target = T.tensor(target).to(self.critic.device)
        target = target.view(self.batch_size, 1)

        # Critic update
        self.critic.train()
        self.critic.optimizer.zero_grad()
        critic_loss = F.mse_loss(target, critic_value)
        critic_loss.backward()
        self.critic.optimizer.step()

        # Actor update. The zero_grad, forward, and step calls on the actor
        # were lost in extraction and are reconstructed here.
        self.critic.eval()
        self.actor.optimizer.zero_grad()
        mu = self.actor.forward(state)
        actor_loss = -self.critic.forward(state, mu)
        actor_loss = T.mean(actor_loss)
        actor_loss.backward()
        self.actor.optimizer.step()

        self.update_network_parameters()

    def update_network_parameters(self, tau=None):
        if tau is None:
            tau = self.tau
        actor_params = self.actor.named_parameters()
        critic_params = self.critic.named_parameters()
        target_actor_params = self.target_actor.named_parameters()
        target_critic_params = self.target_critic.named_parameters()

        critic_state_dict = dict(critic_params)
        actor_state_dict = dict(actor_params)
        target_critic_dict = dict(target_critic_params)
        target_actor_dict = dict(target_actor_params)

        # Soft update: tau * original + (1 - tau) * target
        for name in critic_state_dict:
            critic_state_dict[name] = tau*critic_state_dict[name].clone() + \
                (1-tau)*target_critic_dict[name].clone()
        self.target_critic.load_state_dict(critic_state_dict)

        for name in actor_state_dict:
            actor_state_dict[name] = tau*actor_state_dict[name].clone() + \
                (1-tau)*target_actor_dict[name].clone()
        self.target_actor.load_state_dict(actor_state_dict)

The most crucial part is the learning function. First, we fill the replay memory until it holds at least one batch of samples, and then we start sampling batches to update our networks. We calculate the critic and actor losses, and then soft update all the target parameters.
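The target that the critic is trained against can be sketched in a few lines of plain Python. This is only an illustration with made-up numbers: the function name td_targets is hypothetical, and it takes a plain terminal flag, whereas the agent code above multiplies by a stored mask.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Critic targets: reward plus the discounted next Q value, unless the episode ended."""
    return [
        r + gamma * q * (0.0 if d else 1.0)
        for r, q, d in zip(rewards, next_q_values, dones)
    ]

# Two toy transitions: the second one ends the episode.
targets = td_targets(
    rewards=[1.0, 2.0],
    next_q_values=[10.0, 10.0],   # what the target critic predicts for the next state
    dones=[False, True],
    gamma=0.5,
)
print(targets)
```

The critic loss is then the mean squared error between these targets and the critic's own Q values for the sampled state-action pairs.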

env = OurCustomEnv(sales_function, obs_range, act_range)
agent = Agent(alpha=0.000025, beta=0.00025, tau=0.001, env=env,
              batch_size=64, n_actions=2)
score_history = []
for i in range(10000):
    obs = env.reset()
    done = False
    score = 0
    while not done:
        act = agent.choose_action(obs)
        new_state, reward, done, info = env.step(act)
        agent.remember(obs, act, reward, new_state, int(done))
        agent.learn()
        score += reward
        obs = new_state
    score_history.append(score)

Results

In just a few minutes, we have training results ready. The agent exhausts almost the full budget, and we have a graph from training.

These results can be achieved even faster if we make changes in hyperparameters and reward functions.

Also thanks to Phil and Andrej Karpathy for their marvellous work.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Creating Linear Model, Its Equation And Visualization For Analysis

This article was published as a part of the Data Science Blogathon.


Linear Regression:

Fig. 1.0: The Basic Linear Regression model Visualization

The linear model (linear regression) was probably the first model you learned and created, using it to predict a target’s continuous values. You must have been happy that you had completed a model. You were probably also taught the theory behind how it works: empirical risk minimization, mean squared loss, gradient descent, and the learning rate, among others.

Well, this is great, until all of a sudden I was called to explain a model I had created to the manager. All those terms were jargon to him, and he asked for the model visualization (as in Fig. 1.0), that is, the model’s fitted line (the red line) and the data points (the blue dots). I froze to my toes, not knowing how to create that in Python code.

Well, that’s what the first part of this article is about: creating the basic linear model visualization in your Jupyter notebook in Python.

Let’s begin using this random data:

X    y
1    2
2    3
3    11
4    13
5    28
6    32
7    50
8    59
9    85

Method 1: Manual Formulation

Importing our library and creating the DataFrame:

Now, at this stage, there are two ways to perform this visualization:

1.) Using mathematical knowledge

2.) Using the LinearRegression class from scikit-learn’s linear_model module.

Let’s get started with the Math😥😥.

Just follow through, it’s not that difficult. First, we define the equation for a linear relationship between y (dependent variable/target) and X (independent variable/feature) as:

         y = mX + c

where y = target

           X = feature

           m = slope

           c = y-intercept constant

To create the model’s equation we have to get the values of m and c. We can get these from y and X with the equations below:

         m = sum((x_i - mean(x)) * (y_i - mean(y))) / sum((x_i - mean(x))^2)

         c = mean(y) - m * mean(x)

The slope m is the sum, over all data points, of the product of each x value’s difference from the mean of x and the corresponding y value’s difference from the mean of y, divided by the sum of the squared differences between each x value and the mean of x.

The intercept c is simply the mean of y minus the product of the slope and the mean of x.

That is a lot to take in. Read it over a few times till you get it, and try reading it alongside the picture.

👆👆 That was the only challenge. If you’ve understood it, congratulations; let’s move on.

Now writing this in Python code is ‘easy-peasy’ using the numpy library. Check it out👇👇.
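The original code image is not available here, so the following is a stand-in sketch rather than the author's exact code. It computes the slope and intercept for the table's data using plain Python instead of numpy, so it runs without any installs; the function name fit_line is made up.

```python
# Data from the table above
X = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2, 3, 11, 13, 28, 32, 50, 59, 85]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Sum of products of the deviations from the means
    numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys))
    # Sum of squared deviations of x from its mean
    denominator = sum((xi - x_mean) ** 2 for xi in xs)
    m = numerator / denominator
    c = y_mean - m * x_mean
    return m, c

m, c = fit_line(X, y)
print(f"y = {m:.2f}X + ({c:.2f})")
```

For this data the slope comes out at about 9.95 and the intercept at about -18.31.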

To blow your mind: did you know that this is the model’s equation? We just created a model without using scikit-learn. We will confirm it now using the second method, which is scikit-learn’s LinearRegression class.

Method 2: Using scikit-learn’s Linear regression

We’ll import LinearRegression from scikit-learn, fit the model on the data, and then confirm the slope and the intercept. The steps are in the image below.

So you can see that there is almost no difference. Now let us visualize this as in Fig. 1.0.


The red line is our line of best fit, which will be used for prediction, and the blue points are our initial data. With this, I had something to report back to the manager. I did this for each feature against the target to add more insight.

Now we have achieved our goal of creating a model and showing its plotted graph.

This technique may be time-consuming with larger data sets and should only be used when visualizing your line of best fit against a particular feature for analysis purposes. It is not really necessary during modeling unless requested. Errors and calculations may eat up your time and computing resources, especially if you’re working with 3D and higher-dimensional data. But the insight gained is worth it.

I hope you enjoyed the article. If yes, that’s great; you can also tell me how to improve in any way. I still have a lot to share, especially on regressions (linear, logistic, and polynomial).

Thank You for reading through.


SQL For Beginners And Analysts – Get Started With SQL Using Python


SQL is a mandatory language every analyst and data science professional should know

Learn about the basics of SQL here, including how to work with SQLite databases using Python

SQLite – The Lightweight and Quick Response Database!

SQL is a language every analyst and data scientist should know. There’s no escaping from this. You will be peppered with SQL questions in your analytics or data science interview rounds, especially if you’re a fresher in this field.

If you’ve been putting off learning SQL recently, it’s time to get into action and start getting your hands dirty. You would have to learn about databases to work with data so why not start your SQL journey today?

I’ve personally been working with SQL for a while and can attest to how useful it is, especially in these golden data-driven times. SQL is a simple yet powerful language that helps us manage and query data directly from a database, without having to copy it first.

It is also very easy to understand because of the various clauses that are similar to those used in the English language. So writing SQL commands will be a piece of cake for you!

And given the proliferation of data all over the world, every business is looking for professionals who are proficient in SQL. So once you add SQL skill to your resume, you will be a hotshot commodity out in the market. Great, but where to begin?

There are many different database systems out there, but the simplest and easiest to work with is SQLite. It is fast, compact, and stores data in an easy to share file format. It is used inside countless mobile phones, computers, and various other applications used by people every day. And the most amazing part, it comes bundled with Python! Heck, there is a reason why giants like Facebook, Google, Dropbox, and others use SQLite!

In this article, we will explore how to work with databases in Python using SQLite and look into the most commonly used SQL commands. So let’s start by asking the very basic question – what on earth is a database?

Table of Contents

What is a Database?

What is SQL?

Why Should you use SQLite?

Connecting to an SQLite Database

Creating tables using SQL

Inserting values in a table using SQL

Fetching records from a table using SQL

Loading a Pandas DataFrame into SQLite Database

Reading an SQLite Database into a Pandas DataFrame

Querying SQLite Database

Where clause

Group By clause

Order By clause

Having clause

Join clause

Update statement

Delete statement

Drop Table statement

What is a Database?

A database is an organized collection of interrelated data stored in an electronic format.

While there are various types of databases and their choice of usage varies from organization to organization, the most basic and widely used is the Relational Database model. It organizes the data into tables where each row holds a record and is called a tuple. And each column represents an attribute for which each record usually holds a value.

A Relational database breaks down different aspects of a problem into different tables so that storing them and manipulating them becomes an easy task. For example, an e-commerce website maintaining a separate table for products and customers will find it more useful for doing analytics than saving all of the information in the same table.

A Database Management System (DBMS) is software that lets users and applications store, retrieve, and manipulate data in a database. A Relational Database Management System (RDBMS) is a DBMS for relational databases. There are many RDBMSs, such as MySQL, PostgreSQL, and SQL Server, which use SQL for accessing the database.

What is SQL?

But wait – we’ve been hearing the word ‘SQL’ since the beginning. What in the world is SQL?

SQL stands for Structured Query Language. It is a querying language designed for accessing and manipulating information from RDBMS.

SQL lets us write queries or sets of instructions to either create a new table, manipulate data or query on the stored data. Being a data scientist, it becomes imperative for you to know the basics of SQL to work your way around databases because you can only perform analysis if you can retrieve data from your organization’s database!

Why Should you use SQLite?

SQLite stores data in variable-length records, which requires less memory and makes it run faster. It is designed for improved performance and reduced cost, and it is optimized for concurrency.

The sqlite3 module facilitates the use of SQLite databases with Python. In this article, I will show you how to work with an SQLite database in Python. You don’t need to download SQLite as it is shipped by default along with Python version 2.5 onwards!

Connecting to an SQLite Database

The first step to working with your database is to create a connection with it. We can do this by using the connect() method that returns a Connection object. It accepts a path to the existing database. If no database exists, it will create a new database on the given path.

The next step is to generate a Cursor object using the cursor() method which allows you to execute queries against a database:
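A minimal sketch of these two steps is below. The ":memory:" path is an assumption made so the example leaves no file behind; pass a file path instead to create or open a database on disk.

```python
import sqlite3

# connect() returns a Connection object; ":memory:" keeps the database in RAM.
conn = sqlite3.connect(":memory:")

# cursor() returns a Cursor object used to execute queries.
cur = conn.cursor()

# A trivial query to confirm that the cursor works
cur.execute("SELECT sqlite_version();")
version = cur.fetchone()[0]
print("SQLite version:", version)

conn.close()
```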


You are now ready to execute queries against the database and manipulate the data. But after we have done that, it is very important to do two things:

Commit/save the operations that we performed on the database using the commit() method. If we don’t commit our queries, then any changes we made to the database will not be saved automatically

Close the connection to the database to prevent the SQLite database from getting locked. When an SQLite database is locked, it will not be accessible by other users and will give an error
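These two steps can be sketched as follows. The Notes table is a throwaway made up for this example, and the in-memory database is again an assumption to keep the sketch self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Notes (Body TEXT);")
cur.execute("INSERT INTO Notes VALUES ('hello');")

pending_before = conn.in_transaction   # True: the INSERT is not saved yet
conn.commit()                          # save the changes
pending_after = conn.in_transaction    # False: nothing left to save

conn.close()                           # release the database
print(pending_before, pending_after)
```

The in_transaction attribute is used here only to make the effect of commit() visible; it is not needed in everyday code.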


Creating tables using SQL

Now that we have created a database, it is time to create a table to store values.

Let’s create a table that stores values for a customer of an e-commerce website. It stores values like the customer ID, the ID of the product bought, name, gender, age, and the city the customer is from.

A table in SQL is created using the CREATE TABLE command. Here I am going to create a table called Customer with the following attributes:

User_ID – Id to identify individual customers. This is an Integer data type, Primary key and is defined as Not Null

The Primary key is an attribute or set of attributes that can determine individual records in a table. 

Defining an attribute Not Null will make sure there is a value given to the attribute (otherwise it will give an error).

Product_ID – Id to identify the product that the customer bought. Also defined as Not Null

Name – Name of a customer of Text type

Gender – Gender of a customer of Integer type

Age – Age of the customer of Integer type

Any SQL command can be executed using the execute() method of the Cursor object. You just need to write your query inside quotes. A trailing ; is a requirement in some databases but not in SQLite; it is still good practice, so I will include it with my commands.

So, using the execute() method, we can create our table as shown here:

View the code on Gist.
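A sketch of such a CREATE TABLE call, using only the attributes listed above. The in-memory database and the IF NOT EXISTS guard are my own additions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Column names and types follow the attribute list above.
cur.execute("""
    CREATE TABLE IF NOT EXISTS Customer (
        User_ID    INTEGER PRIMARY KEY NOT NULL,
        Product_ID INTEGER NOT NULL,
        Name       TEXT,
        Gender     INTEGER,
        Age        INTEGER
    );
""")
conn.commit()

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in cur.execute("PRAGMA table_info(Customer);")]
print(columns)

conn.close()
```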

Perfect! Now that we have our table, let’s add some values to it.

Inserting values in a SQL table




A database table is of no use without values. So, we can use the INSERT INTO SQL command to add values to the table. The syntax for the command is as follows:

INSERT INTO table_name (column1, column2, column3, …)

VALUES (value1, value2, value3, …);

But if we are adding values for all the columns in the table, we can just simplify things and get rid of the column names in the SQL statement:

INSERT INTO table_name

VALUES (value1, value2, value3, …);

Like I said before, we can execute SQL statements using the execute() method. So let’s do that!

View the code on Gist.
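Here is one way the two INSERT forms might look with execute(). The customer values are made up, and the ? placeholders are an optional sqlite3 feature that lets the module handle quoting for you:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE Customer (
        User_ID INTEGER PRIMARY KEY NOT NULL,
        Product_ID INTEGER NOT NULL,
        Name TEXT, Gender INTEGER, Age INTEGER
    );
""")

# Values for every column, so the column list can be omitted.
cur.execute("INSERT INTO Customer VALUES (1, 101, 'Asha', 0, 29);")

# Explicit column list, with ? placeholders filled from a tuple.
cur.execute(
    "INSERT INTO Customer (User_ID, Product_ID, Name, Gender, Age) "
    "VALUES (?, ?, ?, ?, ?);",
    (2, 102, "Ravi", 1, 35),
)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM Customer;").fetchone()[0]
conn.close()
```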

What if we want to write multiple Insert commands in a single go? We could use the executescript() method instead:

View the code on Gist.
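As a sketch, executescript() takes one string containing several statements separated by semicolons. The three customers below are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE Customer (
        User_ID INTEGER PRIMARY KEY NOT NULL,
        Product_ID INTEGER NOT NULL,
        Name TEXT, Gender INTEGER, Age INTEGER
    );
""")

# Several INSERT statements run in a single call.
cur.executescript("""
    INSERT INTO Customer VALUES (1, 101, 'Asha', 0, 29);
    INSERT INTO Customer VALUES (2, 102, 'Ravi', 1, 35);
    INSERT INTO Customer VALUES (3, 103, 'Meera', 0, 41);
""")
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM Customer;").fetchone()[0]
conn.close()
```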

Or just simply use the executemany() method without having to repeatedly write the Insert Into command every time! executemany() actually executes an SQL command using an iterator to yield the values:

View the code on Gist.
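A short sketch of executemany(), assuming the rows are held in a list of tuples (any iterable of tuples would do). The customer values are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE Customer (
        User_ID INTEGER PRIMARY KEY NOT NULL,
        Product_ID INTEGER NOT NULL,
        Name TEXT, Gender INTEGER, Age INTEGER
    );
""")

# One tuple per row; each tuple fills the ? placeholders in order.
customers = [
    (1, 101, "Asha", 0, 29),
    (2, 102, "Ravi", 1, 35),
    (3, 103, "Meera", 0, 41),
]
cur.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?, ?);", customers)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM Customer;").fetchone()[0]
conn.close()
```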

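These methods are not limited to the Insert Into command: executescript() can run any sequence of SQL statements, while executemany() also works with other parameterized data-modification statements such as UPDATE and DELETE.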

Now that we have a few values in our table, let’s try to fetch those values from the database.

Fetching Records from a SQL table



For fetching values from the database, we use the SELECT command and the attribute values we want to retrieve:

SELECT column1, column2, … FROM table_name;

If you instead wanted to fetch values for all the attributes in the table, use the * character instead of the column names:

SELECT * FROM table_name;

To fetch only a single record from the database, we can use the fetchone() method:

To fetch multiple rows, you can execute a SELECT statement and iterate over it directly using only a single call on the Cursor object:

But a better way of retrieving multiple records would be to use the fetchall() method which returns all the records in a list format:
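The three ways of fetching rows described above can be sketched side by side. The Customer rows are hypothetical, and I have added ORDER BY only to make the row order predictable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE Customer (
        User_ID INTEGER PRIMARY KEY NOT NULL,
        Product_ID INTEGER NOT NULL,
        Name TEXT, Gender INTEGER, Age INTEGER
    );
""")
cur.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?, ?);",
    [(1, 101, "Asha", 0, 29), (2, 102, "Ravi", 1, 35), (3, 103, "Meera", 0, 41)],
)

# fetchone(): the next row as a tuple, or None when no rows are left.
cur.execute("SELECT * FROM Customer ORDER BY User_ID;")
first_row = cur.fetchone()

# Iterating over the cursor yields one row (a tuple) at a time.
names = [name for (name,) in cur.execute("SELECT Name FROM Customer ORDER BY User_ID;")]

# fetchall(): every remaining row, as a list of tuples.
cur.execute("SELECT User_ID, Age FROM Customer ORDER BY User_ID;")
all_rows = cur.fetchall()

conn.close()
```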

Awesome! We now know how to insert values into a table and fetch those values. But since data scientists love working with Pandas dataframes, wouldn’t it be great to somehow load the values from the database directly into a dataframe?

It is possible, and I am going to show you how to do that. But first, I am going to show you how to store your Pandas dataframe into a database, which is obviously a better way to store your data!

Loading Pandas DataFrame into SQLite database

Pandas lets us quickly write our data from a dataframe into a database using the to_sql() method. The method takes the table name and Connection object as its arguments.

I will use the dataframes from the Food Demand Forecasting hackathon on the DataHack platform which has three dataframes: order information, meal information, and center fulfillment information.

View the code on Gist.
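Since the hackathon files are not reproduced here, this sketch uses a tiny hand-made dataframe in their place. The column names and values are assumptions, not the real dataset:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the meal information dataframe.
meal_df = pd.DataFrame({
    "meal_id": [1, 2, 3],
    "category": ["Beverages", "Rice Bowl", "Starters"],
    "cuisine": ["Thai", "Indian", "Italian"],
})

# index=False skips the dataframe index; if_exists="replace" overwrites
# a table with the same name instead of raising an error.
meal_df.to_sql("meal", conn, index=False, if_exists="replace")

stored_rows = conn.execute("SELECT COUNT(*) FROM meal;").fetchone()[0]
conn.close()
```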

We now have three tables in the database, one for each dataframe. It is easy to check them using the read_sql_query() method, which we will explore in the next section, where we will see how to load a database into a Pandas dataframe.

Reading an SQLite Database into a Pandas DataFrame

The read_sql_query() method of the Pandas library returns a DataFrame corresponding to the result of an SQL query. It takes as an argument the query and the Connection object to the database.

We can check the values in the tables using the read_sql_query() method:

View the code on Gist.
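A minimal sketch of read_sql_query(), assuming a small centers table with made-up rows in an in-memory database:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Hypothetical rows standing in for the center fulfillment data.
conn.execute("CREATE TABLE centers (center_id INTEGER, center_type TEXT);")
conn.executemany(
    "INSERT INTO centers VALUES (?, ?);",
    [(10, "TYPE_A"), (11, "TYPE_B")],
)

# The query result comes back as a DataFrame, one column per selected field.
centers_df = pd.read_sql_query("SELECT * FROM centers;", conn)
print(centers_df.head())

conn.close()
```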

Perfect! Now let’s try to run some queries on these tables and understand a few important SQL commands that will come in handy when we try to analyze data from the database.

Querying our SQLite Database

Where clause

The first important clause is the WHERE clause. It is used to filter the records based on a condition:

SELECT column1, column2, … FROM table_name

WHERE condition;

We can always use the * character if we want to retrieve values for all the columns in the table.

We can use it to query and retrieve only the Indian cuisine meals from the meal table:

Here, we have retrieved all the 12 records that matched our given condition. But what if we only wanted to retrieve the top 5 records that satisfy our condition? Well, we could use the LIMIT clause in that case.

LIMIT clause returns only the specified number of records and is useful when there are a large number of records in the table.

Here, we returned only the top 5 records from those that matched our given condition.
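The WHERE and LIMIT queries above could be sketched like this. The meal rows and the cuisine column name are assumptions made for the example, so the counts differ from the real dataset, and ORDER BY is added only to keep the output stable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE meal (meal_id INTEGER, category TEXT, cuisine TEXT);")
cur.executemany(
    "INSERT INTO meal VALUES (?, ?, ?);",
    [
        (1, "Beverages", "Thai"),
        (2, "Rice Bowl", "Indian"),
        (3, "Biryani", "Indian"),
        (4, "Pasta", "Italian"),
        (5, "Starters", "Indian"),
    ],
)

# WHERE keeps only the rows that satisfy the condition.
indian = cur.execute(
    "SELECT meal_id, category FROM meal WHERE cuisine = ? ORDER BY meal_id;",
    ("Indian",),
).fetchall()

# LIMIT caps how many of the matching rows are returned.
top_two = cur.execute(
    "SELECT meal_id, category FROM meal WHERE cuisine = 'Indian' "
    "ORDER BY meal_id LIMIT 2;"
).fetchall()

conn.close()
```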

Group By statement

GROUP BY statement separates rows into different groups based on an attribute and can be used to apply an aggregate function (COUNT, MIN, MAX, SUM) on the resultant groups:

SELECT column1, column2, … FROM table_name

GROUP BY column_name;

We can use the GROUP BY statement to compare the number of orders for meals that received email promotions to those that did not. We will group the records on the emailer_for_promotion column and apply the COUNT aggregate function on the id column since it contains unique values. This will return the total number of rows belonging to each group:

Here we can see that there were more orders for meals that did not have an email promotion. But if we want to order our result, we can use the ORDER BY statement.
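The grouping described above can be sketched on a handful of hypothetical orders. Only the id and emailer_for_promotion columns are included, and the values are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE train (id INTEGER, emailer_for_promotion INTEGER);")
cur.executemany(
    "INSERT INTO train VALUES (?, ?);",
    [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1)],
)

# One output row per distinct value of emailer_for_promotion.
counts = cur.execute(
    "SELECT emailer_for_promotion, COUNT(id) "
    "FROM train GROUP BY emailer_for_promotion;"
).fetchall()

# Convert to a dict because GROUP BY alone does not promise a row order.
counts_by_flag = dict(counts)
conn.close()
```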

Order By clause

ORDER BY clause is used to sort the result into ascending or descending order using the keywords ASC or DESC respectively. By default, it sorts the records in ascending order:

SELECT column1, column2, … FROM table_name

ORDER BY column_name ASC|DESC;

Here I have combined two clauses: Group By and Order By. The Group By clause groups the values based on the emailer_for_promotion attribute and the Order By clause orders the output based on the count of the rows in each group. We can combine a bunch of clauses to extract more precise information from the database.

To sort the result in descending order, just type in the keyword DESC:
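A sketch of both sort directions, reusing the same kind of hypothetical train rows. The alias n for the count is my own naming:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE train (id INTEGER, emailer_for_promotion INTEGER);")
cur.executemany(
    "INSERT INTO train VALUES (?, ?);",
    [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1)],
)

base_query = (
    "SELECT emailer_for_promotion, COUNT(id) AS n "
    "FROM train GROUP BY emailer_for_promotion ORDER BY n"
)

# Ascending is the default; DESC reverses the order.
ascending = cur.execute(base_query + ";").fetchall()
descending = cur.execute(base_query + " DESC;").fetchall()

conn.close()
```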

Having clause

The HAVING clause filters the groups produced by a GROUP BY statement, usually on the value of an aggregate function. It should not be confused with the WHERE clause, which applies its filter condition to individual rows before grouping.

HAVING is used to filter records after grouping. Hence, the HAVING clause is always used after the GROUP BY statement:

SELECT column1, column2, … FROM table_name

GROUP BY column_name

HAVING condition;

Here, we returned only those groups that had a count of more than 15.
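To illustrate HAVING on something small, this sketch groups hypothetical orders by center_id and keeps the groups with more than 2 rows. The grouping column and the threshold are assumptions chosen to fit the toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE train (id INTEGER, center_id INTEGER);")
cur.executemany(
    "INSERT INTO train VALUES (?, ?);",
    [(1, 10), (2, 10), (3, 10), (4, 11), (5, 13), (6, 13), (7, 13), (8, 13)],
)

# HAVING is evaluated after grouping, so it can test the aggregate COUNT(id).
busy_centers = cur.execute(
    "SELECT center_id, COUNT(id) FROM train "
    "GROUP BY center_id HAVING COUNT(id) > 2 ORDER BY center_id;"
).fetchall()

conn.close()
```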

Join clause

Join clause is a very interesting and important SQL clause. It retrieves and combines data from multiple tables on the same query based on a common attribute:

SELECT column1, column2, … FROM table1

INNER JOIN table2

ON table1.column_name = table2.column_name;

In our database, we can retrieve data from the centers and train tables since they share the common attribute center_id:

The INNER JOIN clause combines the two tables, train and centers, on the common attribute center_id specified by the condition train.center_id = centers.center_id. This means that records with the same center_id in both tables are joined side by side into a single row.

This way we were able to retrieve the center_type, from centers, and the corresponding total number of orders from the train table. The . operator is very important here as it lets the database know which table the column belongs to.
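A sketch of the join on toy data. The num_orders column name and every value are assumptions; only center_id and center_type come from the text above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE centers (center_id INTEGER, center_type TEXT);")
cur.execute("CREATE TABLE train (id INTEGER, center_id INTEGER, num_orders INTEGER);")
cur.executemany(
    "INSERT INTO centers VALUES (?, ?);",
    [(10, "TYPE_A"), (11, "TYPE_B"), (13, "TYPE_A")],
)
cur.executemany(
    "INSERT INTO train VALUES (?, ?, ?);",
    [(1, 10, 100), (2, 11, 50), (3, 13, 70), (4, 10, 30)],
)

# table.column tells SQLite which table each column belongs to.
orders_by_type = cur.execute(
    "SELECT centers.center_type, SUM(train.num_orders) "
    "FROM train INNER JOIN centers "
    "ON train.center_id = centers.center_id "
    "GROUP BY centers.center_type ORDER BY centers.center_type;"
).fetchall()

conn.close()
```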

If you want to know more about joins, I suggest going through this excellent article.

Update statement

Now, let’s say there was a glitch in the system and the base price for all the orders was saved as 10 more than the actual amount. We want to make that correction in the database as soon as we find the mistake.

In such a situation, we will use the UPDATE SQL command.

The UPDATE command is used to modify existing records in a table. However, always make sure you specify which records need to be updated in the WHERE clause; otherwise, all the records will be updated!

UPDATE table_name

SET column1 = value1, column2 = value2, …

WHERE condition;

Let’s have a look at the table before the update:

Decrease all the base prices by 10 for orders containing meals that had an email promotion:

View the code on Gist.

Finally, here’s a look at the updated table:

All the records have been correctly updated!
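The update can be sketched on three hypothetical orders. The base_price column name and the prices are assumptions for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE train (id INTEGER, emailer_for_promotion INTEGER, base_price REAL);"
)
cur.executemany(
    "INSERT INTO train VALUES (?, ?, ?);",
    [(1, 1, 160.0), (2, 0, 140.0), (3, 1, 210.0)],
)

# Only rows matching the WHERE condition are changed.
cur.execute(
    "UPDATE train SET base_price = base_price - 10 "
    "WHERE emailer_for_promotion = 1;"
)
rows_changed = cur.rowcount  # number of rows the UPDATE touched
conn.commit()

prices = cur.execute("SELECT id, base_price FROM train ORDER BY id;").fetchall()
conn.close()
```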

Delete statement

Now, suppose center number 11 no longer wants to continue to do business with the company. We have to delete their records from the database. We use the DELETE statement. However, make sure to use the WHERE clause otherwise all the records will be deleted from the table!

DELETE FROM table_name

WHERE condition;

Perfect! We no longer have any records corresponding to center 11.
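A sketch of the delete, assuming a small centers table with invented rows. In the real database you would probably repeat the statement for every table that references center 11:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE centers (center_id INTEGER, center_type TEXT);")
cur.executemany(
    "INSERT INTO centers VALUES (?, ?);",
    [(10, "TYPE_A"), (11, "TYPE_B"), (13, "TYPE_A")],
)

# Without the WHERE clause this would empty the whole table.
cur.execute("DELETE FROM centers WHERE center_id = ?;", (11,))
conn.commit()

remaining = [row[0] for row in cur.execute(
    "SELECT center_id FROM centers ORDER BY center_id;"
)]
conn.close()
```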

Drop Table statement

Finally, if we had to drop an entire table from the database and not just its records, we use the DROP TABLE statement. But be extra careful before you run this command because all the records in the table along with the table structure will be lost after this!

DROP TABLE table_name;
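As a final sketch, here is a table being dropped and then confirmed gone. The IF EXISTS guard is an optional extra that avoids an error when the table is already missing, and the table itself is a throwaway:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Customer (User_ID INTEGER PRIMARY KEY, Name TEXT);")

# Removes the table structure together with all of its rows.
cur.execute("DROP TABLE IF EXISTS Customer;")
conn.commit()

# sqlite_master lists the objects that exist in the database.
tables = [row[0] for row in cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table';"
)]
conn.close()
```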

End Notes

SQL is a super important language to learn as an analyst or a data science professional. And it’s not a difficult language to pick up, as we’ve seen in this article.

We saw how to create a table in SQL and how to add values to it. We covered some of the most basic and heavily used querying commands like where, having, group by, order by, and joins to retrieve records in the database. Finally, we covered some manipulation commands like update and delete to make changes to tables in the database.

This is by no means an exhaustive guide on SQL and I suggest going through the below great resources to build upon the knowledge that you gathered in this article:

