SQL (Structured Query Language) is a standard programming language specifically designed for managing and manipulating relational databases. It allows users to query, update, insert, and delete data within a database. SQL also supports various functions for sorting, filtering, and aggregating data to facilitate analysis and reporting.
A primary key is a column or a combination of columns in a table that uniquely identifies each row in that table. It must contain unique values and cannot contain NULL values. The primary key ensures entity integrity by ensuring that each record is unique and can be accurately identified.
A foreign key is a column or a set of columns in one table that uniquely identifies rows in another table. It creates a relationship between two tables, ensuring referential integrity. The foreign key values must match values in the referenced table’s primary key or be NULL, enforcing a valid link between related records.
A join is a SQL operation that combines rows from two or more tables based on a related column. The main types of joins are INNER JOIN (returns rows with matching values in both tables), LEFT JOIN (returns all rows from the left table and matched rows from the right table), RIGHT JOIN (returns all rows from the right table and matched rows from the left table), and FULL JOIN (returns rows when there is a match in one of the tables).
In SQL, both the WHERE and HAVING clauses are used to filter data, but they operate at different stages of the query execution process.
Aspect | WHERE Clause | HAVING Clause |
---|---|---|
Usage | Filters rows before grouping. | Filters grouped rows after the GROUP BY clause. |
Application | Used with SELECT, UPDATE, and DELETE statements. | Used exclusively with SELECT statements that have GROUP BY clauses. |
Aggregation | Cannot be used with aggregate functions like SUM(), AVG(), COUNT(), etc. | Typically used with aggregate functions to filter grouped data based on specified conditions. |
Syntax | WHERE condition | HAVING condition |
Evaluation order | Evaluated before grouping and aggregation functions. | Evaluated after grouping and aggregation functions. |
Impact on result | Can filter individual rows of data. | Filters aggregated data based on specified conditions. |
Efficiency | Generally more efficient because it reduces the number of rows processed before aggregation. | Less efficient compared to WHERE clause as it operates on aggregated data, but necessary for filtering aggregated results. |
An index is a database object that improves the speed of data retrieval operations on a table at the cost of additional storage space. Indexes can be created on one or more columns of a table to allow quick lookup of data. Common types include B-tree, bitmap, and unique indexes.
A subquery is a query nested within another query, typically within a SELECT, INSERT, UPDATE, or DELETE statement. Subqueries can be used to perform operations that depend on the result of another query. They can return single values (scalar subqueries), multiple values (row subqueries), or even a set of rows and columns (table subqueries).
Aggregate functions perform a calculation on a set of values and return a single value. Common aggregate functions include COUNT (counts rows), SUM (calculates the total), AVG (calculates the average), MAX (finds the maximum value), and MIN (finds the minimum value). These functions are often used with the GROUP BY clause.
DELETE and TRUNCATE are both SQL commands used to remove data from a table, but they differ in their functionality and how they operate.
Aspect | DELETE | TRUNCATE |
---|---|---|
Operation | Removes specific rows from a table based on a condition or all rows if no condition is specified. | Removes all rows from a table. |
Logging | Logged operation, each deleted row is logged in the transaction log and can be rolled back. | Non-logged operation, it doesn't log each deleted row, and therefore, it cannot be rolled back. |
Transaction | Can be a part of a transaction and can be rolled back using ROLLBACK command. | Cannot be rolled back within a transaction. |
Locking | Holds row-level locks during the operation, potentially causing blocking in concurrent environments. | Acquires a table-level lock for the duration of the operation, reducing contention in concurrent environments. |
Performance | Typically slower than TRUNCATE due to logging and individual row processing. | Generally faster than DELETE as it doesn't log each deletion and deallocates data pages directly. |
Triggers and Constraints | Executes triggers and respects constraints (e.g., foreign key constraints) associated with the table. | Doesn't execute triggers and constraints aren't checked. It is a faster operation but bypasses such validations. |
Usage | Used when selective deletion of rows is required or when triggers and constraints need to be honored. | Used when you need to quickly remove all rows from a table without logging each individual deletion. |
A view is a virtual table based on the result set of a SQL query. It can encapsulate complex queries, simplify data access, and provide an additional layer of security. Views do not store data physically but display data stored in underlying tables. They can be used to restrict data access or present data in a specific format.
A stored procedure is a prepared SQL code that can be saved and reused. It encapsulates a sequence of SQL statements and logic, allowing for modular, reusable, and maintainable database code. Stored procedures can accept parameters, execute complex operations, and return results to the caller.
A transaction is a sequence of one or more SQL operations executed as a single unit of work. Transactions ensure data integrity by adhering to ACID properties. They can be explicitly started with BEGIN TRANSACTION and ended with COMMIT to save changes or ROLLBACK to discard changes if an error occurs.
Constraints are rules enforced on data columns in a table to ensure data integrity and consistency. Common constraints include PRIMARY KEY (ensures unique identification), FOREIGN KEY (enforces referential integrity), UNIQUE (ensures all values in a column are unique), NOT NULL (ensures no null values), and CHECK (ensures values meet a specified condition).
A cursor is a database object used to retrieve, manipulate, and navigate through a result set row-by-row. Cursors are useful for operations that cannot be performed in a set-based manner. They allow for more granular control over data processing but can be resource-intensive and slower compared to set-based operations.
SQL (Structured Query Language) and NoSQL (Not Only SQL) are both database management systems, but they differ in several key aspects:
Aspect | SQL | NoSQL |
---|---|---|
Data Structure | Relational databases with structured schemas, tables, and rows. | Non-relational databases with flexible schemas, including key-value pairs, documents, wide-column stores, or graph databases. |
Scalability | Vertically scalable, meaning you can increase capacity by adding more power (CPU, RAM, SSD) to a single server. | Horizontally scalable, meaning you can handle larger data volumes and user loads by adding more servers to your NoSQL database cluster. |
Schema Flexibility | Requires a predefined schema with a fixed structure. Changes to schema may require downtime and migration. | Offers schema flexibility, allowing changes to the structure without downtime or migrations. |
Query Language | Standardized query language (SQL) for data manipulation and retrieval. | NoSQL databases use their own query languages, which can vary between different databases. |
ACID Transactions | Generally support ACID transactions (Atomicity, Consistency, Isolation, Durability). | Support for ACID transactions varies among NoSQL databases, with some offering full ACID compliance and others providing eventual consistency. |
Data Consistency | Emphasizes strong consistency and ensures that each transaction is consistent and isolated from others. | May prioritize availability and partition tolerance over consistency (CAP theorem), offering eventual consistency or other consistency models. |
Use Cases | Well-suited for applications requiring complex queries and transactions, such as financial systems or traditional relational databases. | Ideal for applications with large volumes of unstructured or semi-structured data, real-time analytics, content management systems, and distributed systems. |
Examples | MySQL, PostgreSQL, Oracle, SQL Server | MongoDB, Cassandra, Redis, Couchbase |
A composite key is a primary key that consists of two or more columns, used together to uniquely identify a row in a table. Composite keys are useful when a single column is not sufficient to guarantee uniqueness. They ensure that each combination of values in the specified columns is unique across the table.
A self-join is a join where a table is joined with itself. This is useful for hierarchical data or when comparing rows within the same table. In a self-join, table aliases are used to distinguish the table instances involved in the join, allowing for the comparison of columns from different rows within the same table.
CHAR and VARCHAR are both data types used to store character strings in SQL databases, but they have some differences:
Aspect | CHAR | VARCHAR |
---|---|---|
Storage | Fixed-length string. | Variable-length string. |
Length | Must be specified when defining the column. | Maximum length must be specified when defining the column. |
Trailing spaces | Trailing spaces are padded to fill the entire length of the column. | Trailing spaces are not padded. |
Storage space | Uses storage space equal to the specified length, regardless of actual data length. | Uses storage space equal to the actual data length plus two bytes to store the length of the data. |
Performance | Generally faster for retrieval and comparison operations, especially for fixed-length data. | May be more efficient in terms of storage space, especially for columns with varying lengths. |
A unique key is a constraint that ensures all values in a column or a combination of columns are unique across the table. Unlike a primary key, a table can have multiple unique keys, and unique keys can contain NULL values (though only one NULL per column). Unique keys prevent duplicate entries while allowing NULLs.
An index scan is a database operation where the entire index is scanned to find matching rows. It is less efficient than an index seek, which directly navigates to matching rows using the index. Index scans occur when the query optimizer determines that scanning the entire index is more efficient than using a specific seek.
Referential integrity is a database concept that ensures that relationships between tables remain consistent. When a foreign key in one table matches a primary key in another, referential integrity ensures that the foreign key value must always point to a valid, existing primary key value or be NULL. This prevents orphaned records and maintains logical relationships between tables.
DROP, TRUNCATE, and DELETE are all SQL commands used to remove data or objects from a database:
Aspect | DROP | TRUNCATE | DELETE |
---|---|---|---|
Operation | Removes an entire table, view, index, or database. | Removes all rows from a table. | Removes specific rows from a table based on a condition. |
Logging | Not logged. | Non-logged operation. | Logged operation. |
Transaction | Cannot be rolled back within a transaction. | Cannot be rolled back within a transaction. | Can be rolled back within a transaction. |
Triggers/Constraints | Drops associated triggers, constraints, and indexes. | Doesn't execute triggers, but constraints are checked. | Executes triggers and respects constraints. |
Performance | Very fast, but irreversible. | Very fast, but irreversible. | Slower compared to TRUNCATE, but faster than DROP. |
A common table expression (CTE) is a temporary result set defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement. CTEs improve query readability and organization by breaking complex queries into simpler parts. They are defined using the WITH keyword and can be referenced like a table in subsequent queries.
The SQL SELECT statement is used to query and retrieve data from one or more tables. It allows specifying columns to retrieve, filtering rows with conditions, sorting results, and performing various transformations like aggregations and joins. SELECT is the most commonly used SQL statement for data retrieval and analysis.
A clustered index determines the physical order of data in a table and is created on the primary key by default. Each table can have only one clustered index, which makes data retrieval fast by storing rows in a sorted manner. Non-clustered indexes, in contrast, store pointers to the data rather than the data itself.
Some Key differences between clustered and non-clustered indexes:
Aspect | Clustered Index | Non-Clustered Index |
---|---|---|
Definition | Determines the physical order of data rows in the table. | Does not alter the physical order of data rows in the table. |
Structure | The actual table rows are stored in the order defined by the index. | Index contains a separate list of keys and pointers to the actual rows. |
Key | Only one clustered index can be created per table. | Multiple non-clustered indexes can be created per table. |
Searching | Generally faster for retrieving large ranges of data. | Typically faster for seeking individual rows. |
Storage | Requires additional storage space as it directly impacts the order of data rows. | Requires less storage space as it only contains the keys and pointers. |
Impact on INSERT/UPDATE/DELETE | Slower for INSERT, UPDATE, and DELETE operations as they may need to rearrange the data to maintain the index order. | Generally faster for INSERT, UPDATE, and DELETE operations as they don't require rearranging the data. |
Suitable Use Cases | Ideal for columns that are frequently used for range queries and have a unique key. | Suitable for columns often used in search operations where the order of data retrieval is not crucial. |
Examples | Primary key column is typically automatically indexed as clustered. | Indexes created manually on columns to improve query performance. |
A database schema is a logical container for database objects such as tables, views, indexes, and procedures. It defines how data is organized and structured within the database. Schemas help to group and manage database objects logically, allowing for better organization, access control, and separation of different application areas.
A correlated subquery is a subquery that references columns from the outer query, making it dependent on the outer query for its values. It is evaluated once for each row processed by the outer query. Correlated subqueries are useful for row-by-row comparisons but can be less efficient than non-correlated subqueries.
INNER JOIN and OUTER JOIN are both types of join operations used in SQL to combine rows from two or more tables based on a related column between them.
Aspect | INNER JOIN | OUTER JOIN |
---|---|---|
Matching Rows | Returns only matching rows from both tables. | Returns matching rows as well as unmatched rows. |
Unmatched Rows | Excludes unmatched rows from either table. | Includes unmatched rows with NULL values. |
Types | Only one type (INNER JOIN). | Three types (LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN). |
Syntax | SELECT columns FROM table1 INNER JOIN table2 ON join_condition; | SELECT columns FROM table1 LEFT/RIGHT/FULL OUTER JOIN table2 ON join_condition; |
Usage | Used when only matching rows are needed. | Used when all rows from one or both tables are needed, with NULL values for unmatched rows. |
SQL injection is a security vulnerability where an attacker can insert malicious SQL code into an input field, potentially gaining unauthorized access to the database or executing harmful commands. To prevent SQL injection, use parameterized queries, stored procedures, and input validation to ensure only expected data is processed.
The GROUP BY clause groups rows sharing a common value into summary rows, often used with aggregate functions like COUNT, SUM, AVG, MAX, and MIN. It allows for data aggregation and analysis by grouping data based on one or more columns, providing insights into patterns and trends within the dataset.
A derived table is a subquery used within the FROM clause of a SELECT statement, treated as a temporary table. Derived tables help simplify complex queries by breaking them into smaller, more manageable parts. They can improve query readability and performance by allowing intermediate results to be processed within a single query.
A composite index is an index on multiple columns in a table. It improves query performance by allowing efficient access based on combinations of the indexed columns. Composite indexes are useful for queries that filter or sort data using multiple columns. The order of columns in the index is crucial for its effectiveness.
RANK() and DENSE_RANK() are both window functions in SQL used to assign ranks to rows in a result set based on a specified ordering. However, they differ in how they handle ties (i.e., rows with the same value being ranked).
Aspect | RANK() | DENSE_RANK() |
---|---|---|
Handling Ties | Leaves gaps in rank values when there are ties. | Does not leave gaps in rank values; assigns consecutive ranks to ties. |
Example | If two rows are tied for the second rank, the next rank will be 4 (no rank 3). | If two rows are tied for the second rank, the next rank will still be 3 (no gaps). |
Result | May produce non-consecutive ranks if there are ties. | Produces consecutive ranks even if there are ties. |
Syntax | RANK() OVER (PARTITION BY partition_expression ORDER BY order_expression) | DENSE_RANK() OVER (PARTITION BY partition_expression ORDER BY order_expression) |
COALESCE() returns the first non-NULL value in a list of expressions. It is used to handle NULL values and provide default values when necessary. COALESCE() is particularly useful in scenarios where NULL values need to be replaced with a meaningful alternative, ensuring that calculations and concatenations do not result in NULL.
A trigger is a special kind of stored procedure that automatically executes in response to certain events on a table or view, such as INSERT, UPDATE, or DELETE. Triggers are used to enforce business rules, maintain audit trails, and ensure data integrity by performing predefined actions whenever specified changes occur.
ACID stands for Atomicity, Consistency, Isolation, and Durability. These are properties that ensure reliable transaction processing in databases. Atomicity ensures that transactions are all-or-nothing, Consistency ensures that transactions bring the database from one valid state to another, Isolation ensures that transactions do not interfere with each other, and Durability ensures that committed transactions are permanently saved.
EXISTS and IN are both SQL operators used to test for the existence of rows in a subquery:
Aspect | EXISTS | IN |
---|---|---|
Usage | Typically used with correlated subqueries. | Used with a list of values or a subquery. |
Evaluation | Stops evaluation after finding the first matching row. | Evaluates all values or rows in the list or subquery. |
Performance | Generally more efficient for large result sets because it stops evaluation after finding a match. | May be less efficient for large result sets because it evaluates all values or rows. |
Null Handling | Handles NULL values explicitly or implicitly based on the subquery. | NULL values in the list or subquery are treated as if they don't exist. |
Subquery Columns | Doesn't require selecting specific columns in the subquery. | Requires selecting specific columns in the subquery to match with the main query. |
Normalization is a database design technique used to minimize redundancy and dependency by organizing fields and table relationships. The goal is to divide a database into smaller, related tables and define relationships between them, thereby improving data integrity and reducing data anomalies. Normalization typically involves applying normal forms (1NF, 2NF, 3NF, etc.).
Denormalization is the process of combining normalized tables into larger tables to improve database performance, typically read performance. While it introduces redundancy and can lead to anomalies, it reduces the number of joins needed to retrieve data, making queries faster in read-heavy applications.
A primary key and a unique key are both constraints used in SQL to enforce uniqueness in a column or combination of columns in a table:
Aspect | Primary Key | Unique Key |
---|---|---|
Uniqueness | Uniquely identifies each row in a table. | Ensures uniqueness but allows NULL values. |
Nullability | Does not allow NULL values. | Allows NULL values (except in Oracle). |
Number | Only one primary key constraint per table. | Multiple unique key constraints per table. |
Purpose | Typically used as a unique identifier for each row. | Used to enforce uniqueness, but with more flexibility regarding NULL values. |
Implicit NOT NULL | Imposes the NOT NULL constraint automatically. | Does not implicitly enforce the NOT NULL constraint. |
Index Creation | Automatically creates a clustered or non-clustered index. | Creates a non-clustered index by default. |
Relationship | Often used as a foreign key in related tables. | Can also be used as a reference for foreign keys. |
OLAP (Online Analytical Processing) databases are designed for complex queries and analysis of large volumes of data, typically used in data warehousing and business intelligence. OLAP databases support multi-dimensional data models, allowing for efficient data aggregation, slicing, dicing, and drilling down. They enable quick retrieval of summarized data for decision-making.
A materialized view is a database object that stores the result of a query physically, unlike a regular view that is a virtual table. Materialized views improve query performance by precomputing and storing the results, allowing for faster data retrieval. They are especially useful for complex queries and data warehousing applications.
The DELETE and DROP commands are both used in SQL, but they serve different purposes:
Aspect | DELETE Command | DROP Command |
---|---|---|
Purpose | Removes one or more rows from a table based on a specified condition. | Removes an entire object (e.g., table, index, view) from the database. |
Operation | Deletes data from a table. | Removes the object itself. |
Rollback | Can be rolled back using a transaction if within a transaction block. | Cannot be rolled back; the action is permanent. |
Syntax | DELETE FROM table_name WHERE condition; | DROP object_type object_name; |
Example (Table) | DELETE FROM table_name WHERE condition; | DROP TABLE table_name; |
Example (Index) | DELETE FROM table_name WHERE condition; | DROP INDEX index_name; |
Example (View) | DELETE FROM table_name WHERE condition; | DROP VIEW view_name; |
A surrogate key is an artificial primary key, typically an integer, used to uniquely identify each row in a table. Unlike natural keys, surrogate keys have no business meaning and are not derived from application data. They are used to simplify key management and ensure uniqueness, especially in large databases with complex relationships.
A cross join and a full join are both types of joins used in SQL, but they serve different purposes and produce different result sets:
Aspect | Cross Join | Full Join |
---|---|---|
Result Set | Cartesian product of rows from both tables. | All rows from both tables, with NULL values where no match is found. |
Matching Rows | No condition is used to match rows. | Matches rows based on a specified condition. |
Syntax | SELECT * FROM table1 CROSS JOIN table2; | SELECT * FROM table1 FULL OUTER JOIN table2 ON join_condition; |
Usage | Rarely used due to its potential for producing a large result set. | Used when you want to combine data from two tables while preserving unmatched rows from both tables. |
Example | SELECT * FROM table1 CROSS JOIN table2; | SELECT * FROM table1 FULL OUTER JOIN table2 ON table1.column = table2.column; |
The CASE statement allows for conditional logic in SQL queries, similar to IF-THEN-ELSE statements in programming languages. It evaluates a list of conditions and returns a corresponding result for the first true condition. CASE is useful for creating calculated fields, transforming data, and implementing complex conditional logic within a query.
The UNION and JOIN operations are both fundamental concepts in SQL, but they serve different purposes:
Aspect | UNION | JOIN |
---|---|---|
Purpose | Combines the results of two or more SELECT statements into a single result set. | Combines rows from two or more tables based on a related column. |
Operation | Vertically concatenates rows from different SELECT statements. | Horizontally combines rows from different tables based on a specified condition. |
Matching Rows | Does not involve matching rows based on a condition. | Selects rows from the tables that have matching values in the specified column(s). |
Result Set | Concatenates rows from different SELECT statements, optionally removing duplicates. | Combines rows from different tables based on a specified condition. |
Syntax | SELECT column1 FROM table1 UNION SELECT column2 FROM table2; | SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column; |
Efficiency | Typically involves less processing compared to joins. | May involve more processing, especially for complex join conditions and large tables. |
The LIKE operator is used for pattern matching in string searches, allowing for partial matches using wildcard characters. The percent sign (%) represents zero or more characters, and the underscore (_) represents a single character. LIKE is useful for searching strings with unknown or variable characters, such as finding names with specific patterns.
To find the duplicates in a table, first create a table with duplicate rows.
DROP TABLE IF EXISTS MEMBERS;
CREATE TABLE MEMBERS (
id serial PRIMARY KEY,
name TEXT
);
INSERT INTO MEMBERS (name)
VALUES
('Praveen'),
('Mahesh'),
('Vijay'),
('Nitin'),
('Deepak'),
('Manish'),
('Neel'),
('Ravi'),
('Diya'),
('Naveen'),
('Dinesh'),
('Raj')
('Mahesh'),
('Vijay'),
('Nitin'),
('Deepak');
Use a select * statement to see all of our entries.
SELECT * FROM MEMBERS;
Use the COUNT() function to find all duplicate rows.
SELECT
-- Get the column.
name,
-- Count how many times this name occurs.
count(*)
FROM
MEMBERS
GROUP BY
-- Using an aggregate function forces us to group all like name together.
NAME
HAVING
-- Only select values that have a count greater than one (multiple entries).
count(*) > 1;
The results show all the names that appear more than once and their count.
To delete multiple entries from a table in SQL Server, you can use the DELETE statement with the IN operator. The IN operator allows you to specify a list of values that you want to delete. For example, the following query will delete all rows from the Customers table where the Country column is equal to "USA":
DELETE FROM Customers WHERE Country IN ("USA");
You can also use the BETWEEN operator to delete a range of rows. For example, the following query will delete all rows from the Customers table where the Age column is between 20 and 30:
DELETE FROM Customers WHERE Age BETWEEN 20 AND 30;
Finally, you can also use the WHERE clause to specify more complex criteria for deleting rows. For example, the following query will delete all rows from the Customers table where the Country column is equal to "USA" and the Age column is greater than 25:
DELETE FROM Customers WHERE Country = "USA" AND Age > 25;
It is important to note that the DELETE statement will not return any rows. The rows that are deleted are simply removed from the table.
Here are some additional tips for deleting multiple entries from a table in SQL Server:
The UNION and UNION ALL operators in SQL are used to combine the results of two or more SELECT statements. The main difference between the two operators is that UNION removes duplicate rows from the results, while UNION ALL does not.
Here is an example of a UNION query:
SELECT * FROM Customers
UNION
SELECT * FROM Suppliers;
This query will return all rows from the Customers table and the Suppliers table, but it will only return each row once.
Here is an example of a UNION ALL query:
SELECT * FROM Customers
UNION ALL
SELECT * FROM Suppliers;
This query will return all rows from the Customers table and the Suppliers table, including any duplicate rows.
In general, you should use UNION when you want to remove duplicate rows from the results. You should use UNION ALL when you want to keep all rows, including duplicates.
Operator | Description |
---|---|
UNION | Removes duplicate rows from the results. |
UNION ALL | Keeps all rows, including duplicates. |
Here are some additional things to keep in mind when using UNION and UNION ALL:
The RANK, ROW_NUMBER, and DENSE_RANK functions in SQL are all window functions that are used to rank rows within a partition. The main difference between the three functions is how they handle ties.
Function | Description |
---|---|
RANK | Assigns the same rank to all rows with the same value. |
ROW_NUMBER | Assigns a unique rank to each row, regardless of the values in the other rows. |
DENSE_RANK | Assigns consecutive ranks to rows, even if there are duplicate values. |
In general, you should use the RANK function when you want to identify the relative rank of each row, even if there are duplicate values. You should use the ROW_NUMBER function when you want to assign a unique rank to each row, regardless of the values in the other rows. You should use the DENSE_RANK function when you want to assign consecutive ranks to rows, even if there are duplicate values.
Here are some additional things to keep in mind when using RANK, ROW_NUMBER, and DENSE_RANK:
SELECT * FROM Customers
WHERE Country NOT IN ("USA");
SELECT * FROM Customers
WHERE NOT EXISTS (SELECT * FROM Orders WHERE Customers.CustomerID = Orders.CustomerID );
SELECT * FROM Customers
EXCEPT
SELECT * FROM Orders;
The method that you choose will depend on the specific requirements of your query.
Referential integrity is a property of relational databases that ensures consistency and accuracy of data by enforcing relationships between tables. It ensures that foreign key values in a child table match primary key values in a parent table, preventing orphaned records and maintaining valid links between related data.
The MERGE and UPDATE statements are both SQL commands used to modify data in a table, but they have different functionalities:
Aspect | MERGE Statement | UPDATE Statement |
---|---|---|
Operation | Performs a combination of INSERT, UPDATE, and DELETE operations in a single statement. | Modifies existing data in a table based on a specified condition. |
Matching | Matches rows based on a specified condition for both source and target tables. | Matches rows based on a specified condition in the target table only. |
Actions | Allows specifying different actions for matched, unmatched, and duplicate rows. | Updates existing rows that match the specified condition. |
Handling | Handles insertions, updates, and deletions in one statement based on the specified conditions. | Focuses solely on updating existing rows based on the specified condition. |
Use Cases | Useful for synchronizing data between two tables or updating a target table based on conditions from a source table. | Used when you know exactly which rows you want to update and what values you want to set for those rows. |
The DISTINCT keyword removes duplicate rows from the result set of a SELECT query. It ensures that each row in the output is unique, useful for generating reports, summaries, and eliminating redundant data. DISTINCT applies to the entire row, not just a single column, ensuring overall uniqueness in the result.
An execution plan is a detailed map of how the database engine executes a query, including the order of operations, indexes used, and join methods. It helps in understanding and optimizing query performance by revealing potential bottlenecks and inefficiencies. Execution plans can be viewed using tools like EXPLAIN or graphical query analyzers.
To check FullText Search is installed, run below query
Select SERVERPROPERTY('IsFullTextInstalled')
We can use an SQL query involving a self-join on the employee's table. Here's an example query for the 'employees' table with the following data:
id | name | salary | manager_id |
---|---|---|---|
1 | Guddu | 100000 | NULL |
2 | Prince | 80000 | 1 |
3 | Baibhav | 120000 | 1 |
4 | Saurav | 90000 | 2 |
5 | Gaurav | 110000 | 2 |
Example SQL query:
SELECT e1.id AS employee_id, e1.name AS employee_name, e1.salary AS employee_salary,
e2.id AS manager_id, e2.name AS manager_name, e2.salary AS manager_salary
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id
WHERE e1.salary > e2.salary;
Some key differences between an inner and left join:
Aspect | Inner Join | Left Join |
---|---|---|
Description | Returns only the rows that have matching values | Returns all rows from the left table and matching rows from the right table, where available |
Syntax | SELECT FROM table1 INNER JOIN table2 ON condition | SELECT FROM table1 LEFT JOIN table2 ON condition |
Usage | Use when you only want matching rows | Use when you want all rows from the left table, and matching rows from the right table, where available |
Result | Includes only matching rows | Includes all rows from the left table; for non-matching rows from the right table, NULL values are used |
To update a table and swap gender values, we can use an SQL UPDATE statement. For example we have a table named 'employees' with columns 'id', 'name', and 'gender', and we want to swap the gender values for all rows in the table:
UPDATE employees
SET gender = CASE
WHEN gender = 'Male' THEN 'Female'
WHEN gender = 'Female' THEN 'Male'
ELSE gender -- Handle any other gender values, if applicable
END;
Some key differences between the DELETE, TRUNCATE and DROP statements:
Feature | DELETE | TRUNCATE | DROP |
---|---|---|---|
Purpose | Removes specific rows from a table | Removes all rows from a table | Deletes an entire table and its structure |
Functionality | Deletes rows based on a condition | Deletes all rows from the table | Deletes the entire table and its associated objects |
Impact | Affects rows in the table without altering its structure | Removes all data while keeping the table structure | Removes the table along with its structure and associated objects |
Rollback | Can be rolled back within a transaction | Cannot be rolled back (DDL operation) | Cannot be rolled back (DDL operation) |
Performance | Slower compared to TRUNCATE | Faster compared to DELETE | N/A |
Usage | Used to selectively remove specific rows | Used to quickly remove all rows from a table | Used to completely remove a table from the database |
Some key differences between the NOW() and CURRENT_DATE functions:
Feature | NOW() | CURRENT_DATE |
---|---|---|
Purpose | Returns the current date and time | Returns the current date (without time) |
Data Type | Returns a timestamp with time zone | Returns a date (without time) |
Time Zone | May include the time zone depending on the database system configuration | Does not include the time zone |
Usage | Used when you need the current date and time, including the time zone information if needed | Used when you only need the current date without the time |
Some key differences between the WHERE and the HAVING clause:
Feature | WHERE | HAVING |
---|---|---|
Purpose | Filters rows before groups are formed | Filters groups after groups are formed |
Applied to | Applies to individual rows in the result set | Applies to groups defined by GROUP BY clause |
Order of Execution | Applied before GROUP BY | Applied after GROUP BY |
Aggregates | Cannot contain aggregate functions | Can contain aggregate functions |
Usage | Used to filter rows based on individual row conditions | Used to filter groups based on aggregated conditions |
Example | SELECT * FROM table WHERE condition; | SELECT column1, aggregate_function(column2) FROM table GROUP BY column1 HAVING condition; |
To select only the even-numbered rows from a table of names, you can use the ROW_NUMBER() window function to assign a sequential number to each row and then filter for rows with even row numbers. Here's how you can write the query:
SELECT name
FROM (
SELECT name, ROW_NUMBER() OVER (ORDER BY name) AS row_num
FROM your_table_name
) AS numbered_rows
WHERE row_num % 2 = 0;
To copy the contents of one table to a new table, we can use the CREATE TABLE AS SELECT (CTAS) statement or the INSERT INTO SELECT statement, whether the new table already exists or not.
If you want to create a new table and copy the contents of an existing table into it, you can use the CREATE TABLE AS SELECT statement:
CREATE TABLE new_table_name AS
SELECT *
FROM existing_table_name;
If the new table already exists and you want to insert data into it, you can use the INSERT INTO SELECT statement:
INSERT INTO new_table_name
SELECT *
FROM existing_table_name;
Some key differences between LIKE and ILIKE:
Feature | LIKE | ILIKE |
---|---|---|
Case Sensitivity | Case-sensitive (depends on database settings) | Case-insensitive |
Usage | Used for pattern matching where case matters | Used for pattern matching where case doesn't matter |
Syntax | column_name LIKE 'pattern' | column_name ILIKE 'pattern' |
Some key differences between Aggregate and Scalar functions:
Feature | Aggregate Functions | Scalar Functions |
---|---|---|
Purpose | Operate on sets of rows and return a single result, summarizing data | Operate on individual values and return a single result based on each input value |
Usage | Typically used with the GROUP BY clause to perform calculations on groups of rows | Used within SQL expressions to manipulate or transform data on a row-by-row basis |
Examples | SUM(), AVG(), COUNT(), MAX(), MIN() | UPPER(), LOWER(), SUBSTRING(), DATE_FORMAT() |
SELECT department, COUNT(*) AS num_employees
FROM employees
GROUP BY department;
SELECT name, UPPER(name) AS uppercase_name
FROM employees;
Finding the middle value, or median, of numbers in an SQL database can be tricky, especially because some databases don't come with a ready-made median function. But you can still figure it out by using different SQL functions and methods.
WITH Ranked AS (
SELECT numerical_field,
ROW_NUMBER() OVER (ORDER BY numerical_field) AS RowAsc,
ROW_NUMBER() OVER (ORDER BY numerical_field DESC) AS RowDesc
FROM your_table
)
SELECT AVG(numerical_field) AS Median
FROM (
SELECT numerical_field
FROM Ranked
WHERE RowAsc IN ((SELECT COUNT(*) FROM Ranked) / 2, (SELECT COUNT(*) FROM Ranked) / 2 + 1)
OR RowDesc IN ((SELECT COUNT(*) FROM Ranked) / 2, (SELECT COUNT(*) FROM Ranked) / 2 + 1)
) AS SubQuery;
This question will require us to rank salaries and partition that ranking by the department of each individual employee.
I will add a manager id column to be used in the next question.
DROP TABLE IF EXISTS employee;
CREATE TABLE employee (
emp_id int,
emp_name TEXT,
manager_id int,
department TEXT,
salary int
);
INSERT INTO employee (emp_id, emp_name, manager_id, department, salary)
VALUES
(1, 'jaime', 0, 'IT', 85000),
(2, 'robert', 1, 'IT', 75000),
(3, 'lisa', 1, 'IT', 65000),
(4, 'chris', 1, 'IT', 55000),
(5, 'mary', 7, 'SALES', 55000),
(6, 'richard', 7, 'SALES', 85000),
(7, 'jane', 0, 'SALES', 80000),
(8, 'trevor', 7, 'SALES', 65000),
(9, 'joan', 12, 'HR', 55000),
(10, 'jennifer', 12, 'HR', 71000),
(11, 'trish', 12, 'HR', 58000),
(12, 'marge', 0, 'HR', 70000);