个人理解hashmap底层实现原理（一篇文章彻底读懂HashMap之HashMap源码解析）

逗爷 2023-03-19 07:34:06 537

个人理解hashmap底层实现原理（一篇文章彻底读懂HashMap之HashMap源码解析）//入口返回对应的value public V get(Object key) { Node<K V> e; //hash：这个函数上面分析过了。返回key混淆后的hashCode //注意getNode返回的类型是Node：当返回值为null时表示map中没有对应的key，注意区分value为 //null：如果key对应的value为null的话，体现在getNode的返回值e.value为null，此时返回值也是 //null，也就是HashMap的get函数不能判断map中是否有对应的key：get返回值为null时，可能不包 //含该key，也可能该key的value为null！那么如何判断map中是否包含某个key呢？见下面contains //函数分析 return (e = getNode(hash(key) key)) == null

个人理解hashmap底层实现原理（一篇文章彻底读懂HashMap之HashMap源码解析）(1)

就身边同学的经历来看，HashMap是求职面试中名副其实的“明星”，基本上每一加公司的面试多多少少都有问到HashMap的底层实现原理、源码等相关问题。

在秋招面试准备过程中，博主阅读过很多关于HashMap源码分析的文章，漫长的拼凑式阅读之后，博主没有看到过一篇能够通俗易懂、透彻讲解HashMap源码的文章（可能是博主没有找到）。秋招结束后，国庆假期抽空写下了这篇文章，用一种通俗易懂的方式向读者讲解HashMap源码，并且尽量涵盖面试中HashMap的所有考察点。希望能够对后面的求职者有所帮助~

这篇文章将会按以下顺序来组织：

HashMap源码分析（JDK8，通俗易懂）
HashMap面试“明星”问题汇总，以及明星问题答案

下面是JDK8中HashMap的源码分析，在下文源码分析中：

注释多少与重要性成正比

HashMap的成员属性源码分析

public class HashMap<K V> extends AbstractMap<K V> implements Map<K V> Cloneable Serializable { private static final long serialVersionUID = 362498820763181265L; /** * The default initial capacity - MUST be a power of two. */ //HashMap的初始容量为16，HashMap的容量指的是存储元素的数组大小，即桶的数量 static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16 /** * The maximum capacity used if a higher value is implicitly specified * by either of the constructors with arguments. * MUST be a power of two <= 1<<30. */ //HashMap的最大的容量 static final int MAXIMUM_CAPACITY = 1 << 30; /** * The load factor used when none specified in constructor. */ /** * DEFAULT_LOAD_FACTOR:HashMap的负载因子，影响HashMap性能的参数之一，是时间和空间之间的权 * 衡，后面会看到HashMap的元素存储在Node数组中，这个数组的大小这里称为“桶”的大小. * 另外还有一个参数size指的是我们往HashMap中put了多少个元素。 * 当size==桶的数量*DEFAULT_LOAD_FACTOR的时候，这时HashMap要进行扩容操作，也就是桶不能装 * 满。DEFAULT_LOAD_FACTOR是衡量桶的利用率:DEFAULT_LOAD_FACTOR较小时（桶的利用率较小） * 这时浪费的空间较多（因为只能存储桶的数量*DEFAULT_LOAD_FACTOR个元素，超过了就要进行扩容） * 这种情况下往HashMap中put元素时发生冲突的概率也很小，所谓冲突指的是：多个元素被put到了同一个桶中; * 冲突小时（可以认为一个桶中只有一个元素）put、get等HashMap的操作代价就很低，可以认为是O(1); * 桶的利用率较大的时候(DEFAULT_LOAD_FACTOR很大，注意可以大于1，因为冲突的元素是使用链表或者红黑树连接起来的) * 空间利用率较高， * 这也意味着一个桶中存储了很多元素，这时HashMap的put、get等操作代价就相对较大，因为每一个put或get操作都变成了 * 对链表或者红黑树的操作，代价肯定大于O(1) 所以说DEFAULT_LOAD_FACTOR是空间和时间的一个平衡点; * DEFAULT_LOAD_FACTOR较小时，需要的空间较大，但是put和get的代价较小; * DEFAULT_LOAD_FACTOR较大时，需要的空间较小，但是put和get的代价较大）。 * * 扩容操作就是把桶的数量*2，即把Node数组的大小调整为扩容前的2倍，至于为什么是两倍，分析扩容函数时会讲解， * 这其实是一个trick。Node数组中每一个桶中存储的是Node链表，当链表长度>=8的时候， * 链表会变为红黑树结构（因为红黑树的增删改查复杂度是logn，链表是n，红黑树结构比链表代价更小）。 */ static final float DEFAULT_LOAD_FACTOR = 0.75f; /** * The bin count threshold for using a tree rather than list for a * bin. Bins are converted to trees when adding an element to a * bin with at least this many nodes. The value must be greater * than 2 and should be at least 8 to mesh with assumptions in * tree removal about conversion back to plain bins upon * shrinkage. */ //当某一个桶中链表的长度>=8时，链表结构会转换成红黑树结构（其实还要求桶的中数量>=64 后面会提到） static final int TREEIFY_THRESHOLD = 8; /** * The bin count threshold for untreeifying a (split) bin during a * resize operation. Should be less than TREEIFY_THRESHOLD and at * most 6 to mesh with shrinkage detection under removal. */ //当红黑树中的节点数量<=6时，红黑树结构会转变为链表结构 static final int UNTREEIFY_THRESHOLD = 6; /** * The smallest table capacity for which bins may be treeified. * (Otherwise the table is resized if too many nodes in a bin.) * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts * between resizing and treeification thresholds. */ //上面提到的：当Node数组容量>=64的前提下，如果某一个桶中链表长度>=8，则会将链表结构转换成 //红黑树结构 static final int MIN_TREEIFY_CAPACITY = 64; }

HashMap内部类——Node源码分析
下面介绍HashMap的内部类Node，HashMap的所有数据都保存在Node数组中那么这个Node到底是个什么东西呢？

//Node是HashMap的内部类 static class Node<K V> implements Map.Entry<K V> { //保存key的hashcode=key的hashcode ^ (key的hashcode>>>16)这样做主要是为了减少hash冲突 //当我们往map中put(k v)时，这个k v键值对会被封装为Node，那么这个Node放在Node数组的哪个 //位置呢：index=hash&(n-1) n为Node数组的长度。那为什么这样计算hash可以减少冲突呢？如果直接 //使用hashCode&(n-1)来计算index，此时hashCode的高位随机特性完全没有用到，因为n相对于 //hashcode的值很小。基于这一点，把hashcode 高16位的值通过异或混合到hashCode的低16位，由此 //来增强hashCode低16位的随机性 final int hash; final K key;//保存map中的key V value;//保存map中的value Node<K V> next;//单向链表 Node(int hash K key V value Node<K V> next) {//构造器 this.hash = hash; this.key = key; this.value = value; this.next = next; }

HashMap hash函数和tableSizeFor函数源码分析（这两个函数后面会用到）

/** * HashMap允许key为null，null的hash为0（也意味着HashMap允许key为null的键值对）， * 非null的key的hash高16位和低16位分别由由：key的hashCode * 高16位和hashCode的高16位异或hashCode的低16位组成。主要是为了增强hash的随机性减少hash&(n-1)的 * 随机性，即减小hash冲突，提高HashMap的性能。所以作为HashMap的key的hashCode函数的实现对HashMap * 的性能影响较大，极端情况下：所有key的hashCode都相同，这是HashMap的性能很糟糕！ */ static final int hash(Object key) { int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); } /** * 在new HashMap的时候，如果我们传入了大小参数，这是HashMap会对我们传入的HashMap容量进行传到 * tableSizeFor函数处理：这个函数主要功能是：返回一个数：这个数是大于等于cap并且是2的整数次幂 * 的所有数中最小的那个，即返回一个最接近cap(>=cap)，并且是2的整数次幂的那个数。 * 具体逻辑如下:一个数是2的整数次幂，那么这个数减1的二进制就是一串掩码，即二进制从某位开始是一串连续的1。 */ static final int tableSizeFor(int cap) { //举例而言：n的第三位是1(从高位开始数)， int n = cap - 1; n |= n >>> 1; n |= n >>> 2; n |= n >>> 4; n |= n >>> 8; n |= n >>> 16; //举例而言：如果n为: 00010000 00000000 00000000 000 /* n |= n >>> 1;->n：00011000 00000000 00000000 0000 n |= n >>> 2;->n: 00011110 00000000 00000000 0000 n |= n >>> 4;->n: 00011111 11100000 00000000 0000 n |= n >>> 8;->n: 00011111 11111111 11100000 0000 n |= n >>> 16;->n:00011111 11111111 11111111 1111 返回n 1：00010000 00000000 00000000 000(>=cap并且为2的整数次幂，与cap差值最小的那个) 最后的n 1一定是2的整数次幂，并且一定是>=cap 整体的思路就是：如果n二进制的第k为1，那么经过上面四个‘|’运算后[0 k]位都变成了1 即：一连串连续的二进制‘1’(掩码)，最后n 1一定是2的整数次幂（如果不溢出） */ return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n 1; }

HashMap成员属性源码分析

/** * The table initialized on first use and resized as * necessary. When allocated length is always a power of two. * (We also tolerate length zero in some operations to allow * bootstrapping mechanics that are currently not needed.) */ //我们往map中put的(k v)都被封装在Node中，所有的Node都存放在table数组中 transient Node<K V>[] table; /** * Holds cached entrySet(). Note that AbstractMap fields are used * for keySet() and values(). */ //用于返回keySet和values transient Set<Map.Entry<K V>> entrySet; /** * The number of key-value mappings contained in this map. */ //保存map当前有多少个元素 transient int size; /** * The number of times this HashMap has been structurally modified * Structural modifications are those that change the number of mappings in * the HashMap or otherwise modify its internal structure (e.g. * rehash). This field is used to make iterators on Collection-views of * the HashMap fail-fast. (See ConcurrentModificationException). */ //failFast机制，在讲解ArrayList和LinkedList一文中已经讲过了 transient int modCount; /** * The next size value at which to resize (capacity * load factor). * * @serial */ // (The javadoc description is true upon serialization. // Additionally if the table array has not been allocated this // field holds the initial array capacity or zero signifying // DEFAULT_INITIAL_CAPACITY.) //这也是比较重要的一个属性： //创建HashMap时，改变量的值是：初始容量（2的整数次幂） //之后threshold的值是HashMap扩容的门限值：即当前Node/table数组的长度*loadfactor //举个例子而言，如果我们传给HashMap构造器的容量大小为9，那么threshold初始值为16，在向HashMap中 //put第一个元素后，内部会创建长度为16的Node数组，并且threshold的值更新为16*0.75=12。 //具体而言，当我们一直往HashMap put元素的时候，，如果put某个元素后，Node数组中元素个数为13了 //，此时会触发扩容（因为数组中元素个数>threshold了，即13>threshold=12），具体扩容操作之后会 //详细分析，简单理解就是，扩容操作将Node数组长度*2；并且将原来的所有元素迁移到新的Node数组中。 int threshold; /** * The load factor for the hash table. * * @serial */ //负载因子，见上面对DEFAULT_LOAD_FACTOR 参数的讲解，默认值是0.75 final float loadFactor;

HashMap构造器源码分析

//构造器：指定map的大小，和loadfactor public HashMap(int initialCapacity float loadFactor) { if (initialCapacity < 0) throw new IllegalArgumentException("Illegal initial capacity: " initialCapacity); if (initialCapacity > MAXIMUM_CAPACITY) initialCapacity = MAXIMUM_CAPACITY; if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException("Illegal load factor: " loadFactor); this.loadFactor = loadFactor;//保存loadfactor //注意，前面有讲tableSizeFor函数，该函数返回值：>=initialCapacity、返回值是2的整数次幂 //并且得是满足上面两个条件的所有数值中最小的那个数 this.threshold = tableSizeFor(initialCapacity); } //只指定HashMap容量的构造器，loadfactor使用的是默认的值:0.75 public HashMap(int initialCapacity) { this(initialCapacity DEFAULT_LOAD_FACTOR); } //无参构造器，默认loadfactor：0.75 public HashMap() { this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted } //其他不常用的构造器就不分析了

从构造器中我们可以看到：HashMap是“懒加载”，在构造器中值保留了相关保留的值，并没有初始化table<Node>数组，当我们向map中put第一个元素的时候，map才会进行初始化！

HashMap的get函数源码分析

//入口返回对应的value public V get(Object key) { Node<K V> e; //hash：这个函数上面分析过了。返回key混淆后的hashCode //注意getNode返回的类型是Node：当返回值为null时表示map中没有对应的key，注意区分value为 //null：如果key对应的value为null的话，体现在getNode的返回值e.value为null，此时返回值也是 //null，也就是HashMap的get函数不能判断map中是否有对应的key：get返回值为null时，可能不包 //含该key，也可能该key的value为null！那么如何判断map中是否包含某个key呢？见下面contains //函数分析 return (e = getNode(hash(key) key)) == null ? null : e.value; } public boolean containsKey(Object key) { //注意与get函数区分，我们往map中put的所有的<key value>都被封装在Node中，如果Node都不存在 //显然一定不包含对应的key return getNode(hash(key) key) != null; } //下面分析getNode函数 /** * Implements Map.get and related methods * * @param hash hash for key //下面简称：key的hash值 * @param key the key * @return the node or null if none */ final Node<K V> getNode(int hash Object key) { Node<K V>[] tab; Node<K V> first e; int n; K k; //(n-1)&hash：当前key可能在的桶索引，put操作时也是将Node存放在index=(n-1)&hash位置 //主要逻辑：如果table[index]处节点的key就是要找的key则直接返回该节点； //否则：如果在table[index]位置进行搜索，搜索是否存在目标key的Node：这里的搜索又分两种： //链表搜索和红黑树搜索，具体红黑树的查找就不展开了，红黑树是一种弱平衡(相对于AVL)BST， //红黑树查找、删除、插入等操作都能够保证在O(lon(n))时间复杂度内完成，红黑树原理不在本文 //范围内，但是大家要知道红黑树的各种操作是可以实现的，简单点可以把红黑树理解为BST， //BST的查找、插入、删除等操作的实现在之前的博客中有java实现代码，实际上红黑树就是一种平 //衡的BST if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) { if (first.hash == hash && // always check first node ((k = first.key) == key || (key != null && key.equals(k)))) return first;//一次就匹配到了，直接返回，否则进行搜索 if ((e = first.next) != null) { if (first instanceof TreeNode) //红黑树搜索/查找 return ((TreeNode<K V>)first).getTreeNode(hash key); do { //链表搜索(查找) if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e;//找到了就返回 } while ((e = e.next) != null); } } return null;//没找到，返回null }

get函数实质就是进行链表或者红黑树遍历搜索指定key的节点的过程；另外需要注意到HashMap的get函数的返回值不能判断一个key是否包含在map中，get返回null有可能是不包含该key；也有可能该key对应的value为null。HashMap中允许key为null，允许value为null。

微信公众号“菜鸟名企梦”期待你的关注，公众号后台回复“资料”即可获得4T优质精选高清全套完整不加密学习资料（你想找的资料基本都有~）以及博主学习时使用的精品资料一份。4T资料涵盖各个求职方向，并且每一个方向都有对应可写入简历的大型项目。

个人理解hashmap底层实现原理（一篇文章彻底读懂HashMap之HashMap源码解析）(2)

网站首页

返回栏目

个人理解hashmap底层实现原理（一篇文章彻底读懂HashMap之HashMap源码解析）

猜您喜欢：

相关文章