ARTS Week 31

Algorithm

This week’s LeetCode problem is 380. Insert Delete GetRandom O(1)

Implement the RandomizedSet class:

  • RandomizedSet() Initializes the RandomizedSet object.
  • bool insert(int val) Inserts an item val into the set if not present. Returns true if the item was not present, false otherwise.
  • bool remove(int val) Removes an item val from the set if present. Returns true if the item was present, false otherwise.
  • int getRandom() Returns a random element from the current set of elements (it's guaranteed that at least one element exists when this method is called). Each element must have the same probability of being returned.

You must implement the functions of the class such that each function works in average O(1) time complexity.

In order to keep the average time complexity at O(1), a hash table is needed to store the mapping between values ​​and indexes. At the same time, an array is used as the data structure of the actual saved data, and there is a random number class to facilitate the generation of random numbers. The implementation methods of each method are described below:

  • Initialization method RandomizedSet(): Initialize the above-mentioned array, hash table and random number classes respectively
  • Insert method bool insert(int val): first determine whether val already exists, if so, return false, if not, insert val into the end of the array, and update the hash table map at the same time .put(val, list.size()-1);, finally return true
  • Deletion method bool remove(int val): first determine whether val already exists, if not, return false, if it exists, first obtain val through the hash table and index the index in the array, because it requires time The complexity is O(1), then the deletion method of the array cannot be used, but the value at the index position of the array is changed to the value of the last element of the array list.set(index, lastVal);, and updated at the same time. The mapping relationship in the hash table map.put(lastVal, index);, and then remove the last element of the array (this operation is O(1) complexity) and the mapping of val in the hash table, Finally returns true
  • Randomly returns an item int getRandom(): use the random class to randomly generate a number within the index range of the array, and return the number corresponding to the subscript.

Review

This week’s Review is for the following article: How slow is SELECT * ?

The most well-known query optimization rule is to avoid SELECT * as much as possible, and even if all columns are required, they should be listed by name. On the surface it looks like SELECT * will read unnecessary columns, but the deeper reasons are:

  • These columns are read from memory or, in the worst case, from disk. This means higher resource usage and higher latency.
  • If they’re read form disk they may be cached, if they’re cached they’ll stay in the cache for a longer time. This may be a waste of memory.
  • Columns will be sent over a network. This causes more work both on the database server and the application server.

In some cases SELECT * will seriously affect overall application performance:

  1. Columnar databases. many databases are columnar databases, which store different columns in different data structures, so that SUM(column) is faster, while SELECT * will be slower.
  2. Covering indexes. Comparing SELECT id, surname FROM user WHERE surname < 'c'; and SELECT * FROM user WHERE surname < 'c';, if there is an index on id, the former can be quickly read through the index The corresponding content in the line is compared, but the latter does not:
  3. Strength is in numbers. when there are too many columns in a table, the data in all columns cannot be read only by id
  4. Generate column/Virtual column. When the SQL statement is executed, it will calculate and generate virtual columns. In some cases, it will be written to disk for storage. In some cases, it will be calculated immediately. When it is calculated, it will bring greater overhead.
  5. Views. Views are built on JOIN operations, selecting only required columns may ignore some unnecessary tables, making queries faster
  6. InnoDB TEXTs and BLOBs. In InnoDB (the default storage engine for MariaDB and MySQL), large variable-length text and binary columns are stored in separate memory pages. When long data is stored in separate pages, reading it requires at least one physical read. This affects performance.

Finally, SELECT * is not a good practice, but its impact on query performance and server resource usage is often overstated. Usually other aspects are more important, namely using indexes and avoiding the use of internal temporary tables for two-step sorts.

Tip

I have just started working with shell scripts recently, and the multi-line comments in shell scripts are written:

Share

This week is the second week of quarantine, and it is expected to continue to be quarantined in the next few weeks. At present, I have basically adapted to such a life. I plan to take a break and stop writing for a while, to organize and think about everything again, and hope you to know.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
KeepNewbie_yan

KeepNewbie_yan

5 Followers

A programmer. Share knowledge of programming, operating system, Linux kernel, and reading, thinking etc. Let us maintain a beginner mend and grow together!