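Regular Expressions

1. Metacharacters

1.1 The ? quantifier

used?

Matches: use, used
Effect: the character d before the question mark must appear 0 or 1 times, so d is optional and occurs at most once.

1.2 The * quantifier

ab*c

Matches: ac, abc, abbbbbbc
Effect: the character before * may appear 0 or more times, so b is optional and may repeat any number of times.

1.3 The + quantifier

ab+c

Matches: abc, abbbbbbc; does not match ac
Effect: the character before + must appear 1 or more times, so b is required and may repeat any number of times.

1.4 The {} quantifier

ab{6}c, ab{2,}

The first matches: abbbbbbc
The second matches a followed by two or more b's, so it finds a match inside abbc, abbbc, abbbbc...
Effect: restricts how many times the preceding character may occur: {n} means exactly n times, {n,} at least n times, and {n,m} between n and m times.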
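
The four quantifiers above can be checked side by side. The following is a minimal sketch; the constant names are made up for illustration, and ^ and $ are added so that the whole string has to fit the pattern.

```javascript
// Hypothetical names; ^ and $ force the whole string to fit the pattern.
const optionalD = /^used?$/      // d: 0 or 1 times
const anyB = /^ab*c$/            // b: 0 or more times
const someB = /^ab+c$/           // b: 1 or more times
const sixB = /^ab{6}c$/          // b: exactly 6 times
const twoOrMoreB = /^ab{2,}$/    // b: at least 2 times

console.log(optionalD.test('use'), optionalD.test('usedd'))  // true false
console.log(anyB.test('ac'), someB.test('ac'))               // true false
console.log(sixB.test('abbbbbbc'), sixB.test('abbbc'))       // true false
console.log(twoOrMoreB.test('abb'), twoOrMoreB.test('ab'))   // true false
```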

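1.5 () grouping (atom groups)

(ab)+

Finds a match in: abababac, abc, ababc (the leading run of ab)
Effect: treats ab as one unit, so a whole string can be repeated.

1.6 The | alternation operator

a (cat|dog)

Matches: a cat, a dog
Effect: matches either of the alternatives inside the group.

1.7 [] character classes (atom tables)

[abc]+

Matches: abc, aabbcc
Effect: every matched character must be one of the characters listed inside the brackets.

[a-z]+

Matches: abc, tiger, aabbcc, dog

[a-zA-Z]+

Matches: AbC, aBf

[a-zA-Z0-9]+

Matches: 12Ab, 3BcC

1.8 ^ negation (inside a character class)

[^0-9]

Matches: each character of aBc or a i
Effect: matches any single character that is not a digit.

1.9 \d

\d+

Matches: 4562318, 545
Effect: equivalent to [0-9]+; \d matches a digit character.

1.10 \w

\w+

Matches: each word in "his name is yuan_xin_yue , he is 12 years old" (the spaces and the comma are not matched)
Effect: \w matches English letters, digits, and the underscore.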

1.11 \s

\s

Matches: each space in "he is a very good man"
Effect: matches any whitespace character (including Tab and newline).

1.12 \D, \W, \S

\D, \W, \S

Effect: match a non-digit character, a non-word character, and a non-whitespace character respectively.
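
To see the shorthand classes and their negations on one input, here is a small sketch. The sample string and variable names are invented for illustration.

```javascript
// Hypothetical sample text containing letters, digits, an underscore, spaces, and a tab.
const text = 'id_42 costs 10\tusd'

const digits = text.match(/\d+/g)     // runs of digits
const words = text.match(/\w+/g)      // runs of letters, digits, underscore
const spaces = text.match(/\s/g)      // single whitespace characters
const nonDigits = text.match(/\D+/g)  // runs of anything that is not a digit

console.log(digits)     // ['42', '10']
console.log(words)      // ['id_42', 'costs', '10', 'usd']
console.log(spaces)     // [' ', ' ', '\t']
console.log(nonDigits)  // ['id_', ' costs ', '\tusd']
```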

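1.13 The . metacharacter

.*

Matches: this is a dog, i think you are right
Effect: . matches any character except a newline.

1.14 The ^ and $ anchors

^a, b$

The first matches: abc, alook, about, a
The second matches: adfb, ioob, b
Effect: anchor the match to the start and to the end of the input (of each line when the m flag is set).

1.15 \b

\bthis is a example\b

Effect: marks a word boundary, the zero-width position between a word character (\w) and a non-word character or the start or end of the input. It consumes no character, so a neighbouring newline is not part of the match.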
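
A short sketch of what the boundary changes in practice. The sentence is a made-up example in which "this" also occurs inside a longer word.

```javascript
// Hypothetical sentence: "this" appears as a word and as the start of "thistle".
const sentence = 'this is a thistle'

const anywhere = sentence.match(/this/g)       // also finds the "this" in "thistle"
const wholeWord = sentence.match(/\bthis\b/g)  // only the standalone word
const isAt = sentence.search(/\bis\b/)         // skips the "is" inside "this"

console.log(anywhere.length)   // 2
console.log(wholeWord.length)  // 1
console.log(isAt)              // 5
```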

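1.16 Greedy and lazy matching

Turning a greedy match into a lazy one:

<.+>

Matches: the whole tagged text around "this is a sample text" in one piece, instead of each <...> tag separately
Reason: . matches any character, so it also matches '>', and the greedy + runs on to the last '>'
Fix: <.+?>, which switches to lazy matching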
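
The difference is easiest to see on a string with several tags. This is an illustrative sketch; the HTML string is an assumption, not the sample from the original notes.

```javascript
// Hypothetical input with four tags on one line.
const html = '<b>bold</b> and <i>italic</i>'

const greedy = html.match(/<.+>/g)  // runs from the first '<' to the last '>'
const lazy = html.match(/<.+?>/g)   // stops at the first '>' after each '<'

console.log(greedy)  // ['<b>bold</b> and <i>italic</i>']
console.log(lazy)    // ['<b>', '</b>', '<i>', '</i>']
```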

2. Regex in depth

2.1 Some applications of regular expressions

(Simplest) Extracting the digits

let hd = 'houdunren2200hdcms9988'
let nums = [...hd].filter(a => !Number.isNaN(parseInt(a)))
console.log(nums.join('')) // 22009988

// with a regular expression
console.log(hd.match(/\d/g).join('')) // 22009988

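2.2 Creating a regex with a literal

/pattern/.test(string): creates the regular expression as a literal; test returns a boolean.
A literal cannot interpolate a variable, although eval(`/${a}/`).test(string) can.

let hd = 'houdunren.com'
console.log(/u/.test(hd)) // true

let a = 'u'
console.log(/a/.test(hd)) // false: the literal looks for the letter a, not the variable a

console.log(eval(`/${a}/`).test(hd)) // true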

2.3 Creating a regex with the RegExp object

Variables can now be used:

let hd = 'houdunren.com'
let a = 'ou'
let reg = new RegExp(a, 'g') // first argument: the pattern (here a variable); second: the flags
console.log(reg.test(hd)) // true

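In practice: highlight the parts of a text that match an entered regex or keyword:

let con = prompt('Enter the text to look for (regex syntax is supported)')
let reg = new RegExp(con, 'g')
let div = document.querySelector('div')
// replace: the first argument is the pattern, the second is the replacement (a string, or a function that returns one)
div.innerHTML = div.innerHTML.replace(reg, search => {
    return `<span style="color: red">${search}</span>`
})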

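2.4 Escaping

A drawback of creating a regex with the RegExp object: the string literal is processed first, so its backslashes are consumed:

let price = 23.34
let reg = new RegExp('\d+\.\d+')
console.log(reg.test(price)) // false

// Inside a string, '\d' is just 'd' and '\.' is just '.', so the pattern passed to new RegExp is 'd+.d+'
let test = 'ddd@ddd'
console.log(reg.test(test)) // true: d+ matches the letter d, and . matches any character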

Fix: add one more \

let reg = new RegExp('\\d+\\.\\d+')
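
Doubling every backslash gets hard to read. One alternative, sketched below, is String.raw, which keeps backslashes exactly as typed. The escapeRegExp helper is a hypothetical addition (not part of the original notes) for the related case where text from a user should be matched literally.

```javascript
// String.raw keeps backslashes as written, so the pattern reads like a literal.
const pattern = String.raw`\d+\.\d+`
const priceReg = new RegExp(pattern)
console.log(priceReg.test('23.34'))    // true
console.log(priceReg.test('ddd@ddd'))  // false

// Hypothetical helper: escape regex metacharacters in text that should match literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

const literal = new RegExp(escapeRegExp('1+1=2'))
console.log(literal.test('1+1=2'))  // true
console.log(literal.test('11=2'))   // false: the + is no longer a quantifier
```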

Escaping inside a regex literal:

let url = 'https://www.houdunren.com'
console.log(/https?:\/\/\w+\.\w+/.test(url)) // true

2.5 Boundary constraints

An introductory case:

let hd = 'adf3dsfds'
console.log(/^\d/.test(hd)) // false: the string must start with a digit
hd = '33'
console.log(/^\d$/.test(hd)) // false: ^ and $ together allow exactly one digit

Without boundary constraints, a quantifier such as {3,6} (3 to 6 occurrences) still finds a match inside a string that is longer than 6 characters. Example:

<body>
  <input type="text" name="user">
  <span></span>

  <script>
    document.querySelector('[name="user"]')
      .addEventListener('keyup', function () {
        let flag = this.value.match(/^[a-z]{3,6}$/) // without ^ and $, a longer input such as asfdsfxyz would also match
        document.querySelector('span').innerHTML = (flag ? 'valid' : 'invalid')
      })
  </script>
</body>

2.6 Using match

Without the global flag g, a successful match returns an array (null on failure) with the following contents:

let hd = 'adfsf0sf'
let flag = hd.match(/\d/)
console.log(flag) // ['0', index: 5, input: 'adfsf0sf', groups: undefined]
console.log(flag[0]) // 0
console.log(flag.index) // 5
console.log(flag.input) // adfsf0sf
console.log(flag.groups) // undefined

With the global flag, a successful match returns an array holding all matched strings:

let hd = 'houdunren 2020'
let flag = hd.match(/\d/g)
console.log(flag) // ['2', '0', '2', '0']
console.log(typeof(flag[0])) // string

Checking whether a match succeeded (as a boolean):

let hd = 'haodongxi'
let flag = hd.match(/\w/)
console.log(!!flag)

2.7 \d and \s examples

// metacharacters
let hd = 'houdunren 2020'
console.log(hd.match(/\d/g)) // ['2', '0', '2', '0']
console.log(hd.match(/\d+/g)) // ['2020']

hd = `张三:010-99999999,李四:020-88888888`
console.log(hd.match(/\d{3}-\d{7,8}/g)) // ['010-99999999', '020-88888888']

hd = 'houdunren 2020'
console.log(hd.match(/\D+/)) // ['houdunren ', ...] (the trailing space is a non-digit too)

// \s, \S
console.log(/\s/.test(' hd\n')) // true
console.log(/\S/.test('\nhd')) // true

2.8 \w examples: validating an email address / a user name

<body>
  <script>
    let email = '23012@qq.com'
    console.log(email.match(/^\w+@\w+\.\w+$/))

    // a string that starts with a letter, 5 characters at minimum and 10 at maximum
    let username = prompt('Enter a user name:')
    console.log(/^[a-z]\w{4,9}$/.test(username))
  </script>
</body>

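2.9 A neat way to match every character

A character class such as [\s\S] or [\d\D] matches every character, which works around the fact that the . metacharacter does not match a newline

<body>
  <script>
    let hd = `
    <span>
      houdunren @@@@
      hdcms
    </span>  
  `
    console.log(hd.match(/<span>[\s\S]+<\/span>/)[0]) // matches everything, newlines included
    console.log(hd.match(/.+/)[0]) // matches only one line, because . stops at a newline
  </script>
</body>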

2.10 The i and g flags

i: case-insensitive matching
g: global matching

<body>
  <script>
    let hd = 'hoUdUnren'
    console.log(hd.match(/u/)) // null
    console.log(hd.match(/u/i)) // matches, because case is ignored
    console.log(hd.match(/U/g)) // ['U', 'U']
    console.log(hd.match(/u/ig)) // ['U', 'U']

    console.log(hd.replace(/u/gi, '@')) // ho@d@nren
  </script>
</body>

2.11 The m multiline flag (elegant)

m: multiline matching

Key idea:

map() iterates over an array and returns a new array built from the callback's return values; the original array is left unchanged. For example, starting from ['xxx', 'xxx', 'xxx'], the callback can turn each 'xxx' into an object and return it, which gives [{xx: xxx}, {xx: xxx}, {xx: xxx}]

<body>
  <script>
    let hd = `
    #1 js,200元 #
    #2 php,300元 #
    #9 houdunren.com # 后盾人
    #3 node.js,180元 #
  `
  // target: [{name: 'js', price: '200元'}]
  // The pattern below has problems: '.+' also swallows the trailing '#', and \s also matches the line break and the indentation of the next line
  // console.log(hd.match(/\s*#\d+\s+.+\s+#\s/g))
  // Multiline mode: ^ and $ apply to every line, so lines #1, #2, and #3 match (line #9 does not end with #)
  let lessons = hd.match(/^\s*#\d+\s+.+\s+#$/gm).map(v => {
    v = v.replace(/\s*#\d+\s*/, '').replace(/\s+#/, '')
    // console.log(v) // a string of the form "xx,xxx"
    // console.log(v.split(',')) // an array of the form [xx, xxx]
    let [name, price] = v.split(',') // destructuring: name = 'xx', price = 'xxx'
    return {name, price} // each 'xx,xxx' string becomes an object {name: ..., price: ...}
  })
  // console.log(lessons)
  console.log(JSON.stringify(lessons, null, 2))
  /*
  Output:
  [
    {
      "name": "js",
      "price": "200元"
    },
    {
      "name": "php",
      "price": "300元"
    },
    {
      "name": "node.js",
      "price": "180元"
    }
  ]
  */
  </script>
</body>

Version without comments:

<body>
  <script>
    let hd = `
    #1 js,200元 #
    #2 php,300元 #
    #9 houdunren.com # 后盾人
    #3 node.js,180元 #
  `
  let lessons = hd.match(/\s*#\w+\s+.+\s+#$/gm).map(v => {
    v = v.replace(/\s*#\d+\s*/, '').replace(/\s*#/, '')
    let [name, price] = v.split(',')
    return {name, price}
  })
  console.log(JSON.stringify(lessons, null, 2))
  </script>
</body>
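
Stripped of the parsing logic, the effect of the m flag on ^ and $ can be shown on a three-line string. This is a minimal sketch with invented data.

```javascript
// Hypothetical three-line input, one "name,price" pair per line.
const text = 'js,200\nphp,300\nnode.js,180'

const firstWordOnly = text.match(/^\w+/g)    // ^ matches only at the start of the input
const firstWordPerLine = text.match(/^\w+/gm) // ^ matches at the start of every line
const lastNumberOnly = text.match(/\d+$/g)   // $ matches only at the end of the input
const pricePerLine = text.match(/\d+$/gm)    // $ matches at the end of every line

console.log(firstWordOnly)     // ['js']
console.log(firstWordPerLine)  // ['js', 'php', 'node']
console.log(lastNumberOnly)    // ['180']
console.log(pricePerLine)      // ['200', '300', '180']
```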

2.12 Chinese characters and Unicode properties: the u flag

Match single code points in the category "Letter": /\p{L}/gu

let hd = 'houdunren2010.不断发布教程，加油！@@@3'
console.log(hd.match(/\p{L}/gu)) // ['h', 'o', 'u', 'd', 'u', 'n', 'r', 'e', 'n', '不', '断', '发', '布', '教', '程', '加', '油']

Match punctuation: /\p{P}/gu

let hd = 'houdunren2010.不断发布教程，加油！@@@3'
console.log(hd.match(/\p{P}/gu)) // ['.', '，', '！', '@', '@', '@']

Match by script, for example Han characters or Katakana (for details, see the list of supported scripts on unicode.org):

let hd = 'houdunren2010.不断发布教程，加油！@@@3'
console.log(hd.match(/\p{sc=Han}/gu)) // ['不', '断', '发', '布', '教', '程', '加', '油']
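
Property escapes combine with quantifiers like any other class. The sketch below uses the long form Script=Han (sc=Han is its short alias) and the general category N for numbers; the sample text is a shortened variant of the one above.

```javascript
// \p{...} requires the u flag; without it the pattern means something else.
const text = 'houdunren2010.不断发布教程，加油！'

const hanRuns = text.match(/\p{Script=Han}+/gu)  // consecutive Han characters
const numberRuns = text.match(/\p{N}+/gu)        // consecutive number characters
const hasHan = /\p{Script=Han}/u.test('hdcms')   // no Han characters in this string

console.log(hanRuns)     // ['不断发布教程', '加油']
console.log(numberRuns)  // ['2010']
console.log(hasHan)      // false
```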

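2.13 What the lastIndex property does

lastIndex: the position at which the regular expression starts its next search

Each call to reg.exec('xxx') returns one match together with its index, input, and other properties, and the next call returns the next match. After each match, lastIndex is set to the position just after that match; when no further match is found, exec returns null and lastIndex is reset to 0 (the global flag must be on for this to work!)
xx.match(/\w/g) returns a plain array of matched strings, without the input string, the match position index, groups, and so on. To get those details, call exec repeatedly and let lastIndex advance:

<body>
  <script>
    let hd = 'houdunren'
    // console.log(hd.match(/\w/g)) // a plain array, without index, input, groups, etc.
    let reg = /\w/g
    // each call to exec yields one match
    console.log(reg.lastIndex) // 0
    console.log(reg.exec(hd)) // ['h', index: 0, input: 'houdunren', groups: undefined]
    console.log(reg.lastIndex) // 1
    console.log(reg.exec(hd)) // ['o', index: 1, input: 'houdunren', groups: undefined]
    console.log(reg.lastIndex) // 2
    console.log(reg.exec(hd)) // ['u', index: 2, input: 'houdunren', groups: undefined]
  </script>
</body>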

<body>
  <script>
    let hd = 'houdunren'
    // console.log(hd.match(/\w/g)) // a plain array, without index, input, groups, etc.
    let reg = /\w/g
    let res
    while ((res = reg.exec(hd))) {
      console.log(res)
      /*
      Output:
      ['h', index: 0, input: 'houdunren', groups: undefined]
      ['o', index: 1, input: 'houdunren', groups: undefined]
      ['u', index: 2, input: 'houdunren', groups: undefined]
      ['d', index: 3, input: 'houdunren', groups: undefined]
      ['u', index: 4, input: 'houdunren', groups: undefined]
      ['n', index: 5, input: 'houdunren', groups: undefined]
      ['r', index: 6, input: 'houdunren', groups: undefined]
      ['e', index: 7, input: 'houdunren', groups: undefined]
      ['n', index: 8, input: 'houdunren', groups: undefined]
      */
    }
  </script>
</body>
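
Because lastIndex lives on the regex object, it also affects test(). The sketch below shows a common surprise when one global regex is reused; the variable names are illustrative.

```javascript
// A global regex remembers where the previous search ended.
const reg = /b/g

const first = reg.test('abc')     // finds 'b' at index 1
const afterFirst = reg.lastIndex  // position just after the match
const second = reg.test('abc')    // search starts at index 2, nothing left to find
const afterSecond = reg.lastIndex // reset after the failed search
const third = reg.test('abc')     // starts from 0 again

console.log(first, afterFirst)    // true 2
console.log(second, afterSecond)  // false 0
console.log(third)                // true
```

Dropping the g flag, or setting reg.lastIndex = 0 before each call, avoids the alternating result.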

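2.14 The y flag: sticky matching

In y mode every match must start exactly at lastIndex, so successive matches have to follow one another without gaps

Basic: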

<body>
  <script>
    let hd = 'udunren'
    let reg = /u/y
    console.log(reg.exec(hd)) // ['u', index: 0, input: 'udunren', groups: undefined]
    console.log(reg.lastIndex) // 1
    console.log(reg.exec(hd)) // null
    console.log(reg.lastIndex) // 0
  </script>
</body>

Advanced:

<body>
  <script>
    let hd = `我的qq群是：1111111,9999999,8888888888我就是一个学前端的小白，网址是 iamyuan.com`
    let reg = /(\d+),?/y
    reg.lastIndex = 7 // index of the first digit
    // each result below is an array that also carries input, index, and so on; the comments show group 1
    console.log(reg.exec(hd)) // 1111111
    console.log(reg.exec(hd)) // 9999999
    console.log(reg.exec(hd)) // 8888888888
    console.log(reg.exec(hd)) // null
  </script>
</body>

Final version:

<body>
  <script>
    let hd = `我的qq群是：1111111,9999999,8888888888我就是一个学前端的小白，网址是 iamyuan.com`
    let reg = /(\d+),?/y
    reg.lastIndex = 7
    let qq = []
    let res
    while ((res = reg.exec(hd))) {
      qq.push(res[1])
    }
    console.log(qq)
  </script>
</body>
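
The same "match only at the current position" behaviour is what makes y useful for tokenizing. The following is a rough sketch under stated assumptions: the tokenize function, its token set (integers and a few operators), and its return shape are all invented for illustration.

```javascript
// Hypothetical tokenizer: integers and the operators + - * / ( ), separated by optional whitespace.
function tokenize(input) {
  const token = /\s*(\d+|[-+*\/()])/y  // sticky: must match exactly at lastIndex
  const tokens = []
  let position = 0
  let match
  while (position < input.length && (match = token.exec(input)) !== null) {
    tokens.push(match[1])
    position = token.lastIndex  // remember how far we got
  }
  // complete is false when scanning stopped before the end of the input
  return { tokens, complete: position === input.length }
}

console.log(tokenize('12 + 3*(4-1)'))
// { tokens: ['12', '+', '3', '*', '(', '4', '-', '1', ')'], complete: true }
console.log(tokenize('12 + x'))
// { tokens: ['12', '+'], complete: false }
```

With the g flag instead of y, the scan would silently skip over the unexpected x and continue with any later token, which is usually not what a tokenizer wants.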

2.15 Basic use of character classes

Matching both the xxxx-xx-xx and the xxxx/xx/xx form:

let tel = '2022/03/23'
// let tel = '2022-03-23'
let reg = /^\d{4}[-\/]\d{2}[-\/]\d{2}$/
console.log(tel.match(reg))

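The pattern above still has a problem: the mixed form xxxx-xx/xx also matches. Capture the first separator in a group and require the same one again with \1: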

let reg = /^\d{4}([-\/])\d{2}\1\d{2}$/
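
A quick comparison of the two patterns, as a sketch; the names loose and strict are only labels for this example.

```javascript
// loose: each separator is chosen independently
const loose = /^\d{4}[-\/]\d{2}[-\/]\d{2}$/
// strict: \1 must repeat whatever the first group captured
const strict = /^\d{4}([-\/])\d{2}\1\d{2}$/

console.log(loose.test('2022-03/23'))   // true (mixed separators slip through)
console.log(strict.test('2022-03/23'))  // false
console.log(strict.test('2022/03/23'))  // true
console.log(strict.test('2022-03-23'))  // true
```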

2.16 Ranges in character classes

A character-class range combined with greedy matching:

let hd = 'houdunren'
console.log(hd.match(/[a-z]/g)) // ['h', 'o', 'u', 'd', 'u', 'n', 'r', 'e', 'n']
// greedy matching
console.log(hd.match(/[a-z]+/g)) // ['houdunren']

Example: a leading letter followed by letters, digits, or underscores:

// starts with a letter; letters, digits, and underscores may follow
let input = document.querySelector(`[name="username"]`)
input.addEventListener('keyup', function () {
    console.log(this.value.match(/^[a-z]\w{3,6}$/i))
})

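2.17 Negated matching

let hd = `张三:010-99999999,李四:020-88888888`
// get the Chinese names by excluding everything else
console.log(hd.match(/[^\d:\-,]+/g)) // ['张三','李四']
console.log(hd.match(/\p{sc=Han}+/gu)) // ['张三','李四']

hd = '(houdunren).+'
console.log(hd.match(/[()]/gi)) // ['(', ')']: matches the parentheses
console.log(hd.match(/[.+]/gi)) // ['.', '+']: inside [] the dot and the plus sign are literal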

2.18 Manipulating DOM elements with regex

<body>
  <p>后盾人</p>
  <h1>houdunren.com</h1>
  <h2>hdcms.com</h2>
  <h3></h3>
</body>
<script>
  // remove the heading elements
  let body = document.body
  let reg = /<(h[1-6])>[\s\S]*<\/\1>/ig
  body.innerHTML = body.innerHTML.replace(reg, '')
</script>

2.19 Atom groups (capture groups)

let hd = `
    <h1>houdunren</h1>
    <h2>hdcms</h2>
  `
// Each capture group is numbered automatically and can be referenced as \1, \2, \3...
let reg = /<(h[1-6])>[\s\S]*<\/\1>/ig
console.log(hd.match(reg)) // ['<h1>houdunren</h1>', '<h2>hdcms</h2>']

2.20 Email validation

// addEventListener returns undefined, so the result is not assigned to a variable
document
  .querySelector(`[name='mail']`)
  .addEventListener('keyup', function () {
    let reg = /^[\w-]+@([\w-]+\.)+(com|org|cc|cn|net)$/i
    document.querySelector('span').innerHTML = reg.test(this.value)
      ? 'valid'
      : 'invalid'
  })

2.21 Replacing with capture-group references

Replace the heading tags (h1 to h6) with p tags:

let hd = `
  <h1>houdunren</h1>
  <span>后盾人</span>
  <h2>hdcms</h2>
`
let reg = /<(h[1-6])>([\s\S]+)<\/\1>/ig
console.log(hd.replace(reg, `<p>$2</p>`)) // $2 is the second capture group, ([\s\S]+)
/*
  Output:
    <p>houdunren</p>
    <span>后盾人</span>
    <p>hdcms</p>
*/

Capture groups are numbered from left to right

In str.replace(reg, fun), the parameters p0, p1, p2... of fun are the whole match, the first capture group, the second capture group...

let hd = `
  <h1>houdunren</h1>
  <span>后盾人</span>
  <h2>hdcms</h2>
`
let reg = /<(h[1-6])>([\s\S]+)<\/\1>/ig
hd.replace(reg, (p0, p1, p2) => {
    // console.log(p0) // <h1>houdunren</h1> <h2>hdcms</h2>: the whole match
    console.log(p1) // h1 h2: the first capture group
    // console.log(p2) // houdunren hdcms: the second capture group
})
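
The callback above only logs its arguments. Whatever the callback returns becomes the replacement text, as the following sketch shows; the data-from attribute and the upper-casing are arbitrary choices for the example.

```javascript
// Single-line variant of the sample input, to keep the result easy to read.
const html = '<h1>houdunren</h1><span>后盾人</span><h2>hdcms</h2>'
const headingReg = /<(h[1-6])>([\s\S]+?)<\/\1>/gi

// match: whole match, tag: first group, content: second group
const result = html.replace(headingReg, (match, tag, content) => {
  return `<p data-from="${tag}">${content.toUpperCase()}</p>`
})

console.log(result)
// <p data-from="h1">HOUDUNREN</p><span>后盾人</span><p data-from="h2">HDCMS</p>
```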

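3. Going further with regex

3.1 Nested groups and non-capturing groups

When a regex has two or more groups and one of them should be skipped in the second, third, ... parameters of the replace callback, put ?: at the start of that group, as in (?:com|org|cn). The same numbering applies to a replacement string passed as the second argument of replace, as in hd.replace(reg, '$2')

To leave a group unrecorded, prefix it with ?: so that it is skipped:

let hd = `
    https://www.baidu.com
    http://baidu.com
  `
let reg = /https?:\/\/((?:w+\.)?\w+\.(?:com|org|cn))/i
// the www. group is not recorded
console.log(reg.exec(hd)) // ['https://www.baidu.com', 'www.baidu.com', index: 5, input: '\n    https://www.baidu.com\n    http://baidu.com\n  ', groups: undefined]
console.log(hd.match(reg)) // ['https://www.baidu.com', 'www.baidu.com', index: 5, input: '\n    https://www.baidu.com\n    http://baidu.com\n  ', groups: undefined]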

Important!

let hd = `
   https://www.baidu.com
    http://baidu.com
    https://hdcms.com
`
let reg = /https?:\/\/((?:w+\.)?\w+\.(?:com|org|cn))/ig

let urls = []
let res
// reg.exec is used to get arrays that carry the detailed match information
// res[0] is the whole match, regardless of the groups
while ((res = reg.exec(hd))) {
    urls.push(res[0])
}
console.log(urls) // ['https://www.baidu.com', 'http://baidu.com', 'https://hdcms.com']

let urls1 = []
// res[1] is the first capture group
while ((res = reg.exec(hd))) {
    urls1.push(res[1])
}
// the first capture group
console.log(urls1) // ['www.baidu.com', 'baidu.com', 'hdcms.com']

3.2 Password validation with several regexes

Idea: iterate over the regs array with every and test the password against each member. If all of them pass, the password is valid; otherwise report an invalid password

let input = document.querySelector(`[name='password']`)
input.addEventListener('keyup', (e) => {
  const value = e.target.value
  const regs = [
      /^[a-z0-9]{5,10}$/i,
      /[A-Z]/,
      /[0-9]/
  ]
  let state = regs.every(reg => reg.test(value))
  console.log(state ? 'valid' : 'invalid password')
})
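
Pulling the check out of the event handler makes it easy to try without a page. This is a sketch using the same three rules; the function name isValidPassword is an assumption made for the example.

```javascript
// Hypothetical helper wrapping the three rules from the handler above.
function isValidPassword(value) {
  const rules = [
    /^[a-z0-9]{5,10}$/i,  // 5 to 10 letters or digits, nothing else
    /[A-Z]/,              // at least one uppercase letter
    /[0-9]/               // at least one digit
  ]
  // every() stops at the first rule that fails
  return rules.every(rule => rule.test(value))
}

console.log(isValidPassword('Hdcms2020'))  // true
console.log(isValidPassword('hdcms2020'))  // false: no uppercase letter
console.log(isValidPassword('Houdunren'))  // false: no digit
console.log(isValidPassword('Hd1'))        // false: too short
```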

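3.3 Turning off greediness

Syntax:

// Adding ? after a greedy quantifier makes it lazy
// match is used instead of test so that the matched text is visible
let hd = 'hddd'
console.log(hd.match(/hd{2,3}?/)[0]) // hdd: takes only 2 d's
console.log(hd.match(/hd*?/)[0]) // h: takes no d
console.log(hd.match(/hd+?/)[0]) // hd: takes 1 d
console.log(hd.match(/hd??/)[0]) // h: takes no d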

Lazy matching when replacing tags:

<body>
  <main>
    <span>houdunren</span>
    <span>hdcms.com</span>
    <span>houdunren.com</span>
  </main>
</body>

<script>
  const main = document.querySelector('main')
  const reg = /<span>([\s\S]+?)<\/span>/ig
  main.innerHTML = main.innerHTML.replace(reg, (v, p1) => { // v is the whole match, p1 is the first capture group
    return `<h4 style="color: red;">后盾人-${p1}</h4>`
  })
</script>

3.4 Global matching with matchAll

Using matchAll:

let reg = /<(h[1-6])>([\s\S]+?)<\/\1>/ig
const body = document.body
// match with the g flag gives no details about each match, so matchAll is used; it returns an iterator
const hd = body.innerHTML.matchAll(reg)
let contents = []
// every item yielded by the iterator carries the full match information
for (const iterator of hd) {
  contents.push(iterator[2])
}
console.table(contents)
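
The same idea works on a plain string, which makes it easier to experiment outside a browser. A minimal sketch with an assumed HTML string:

```javascript
// Hypothetical HTML string standing in for document.body.innerHTML.
const html = '<h1>houdunren.com</h1><h2>hdcms.com</h2><h1>后盾人</h1>'
const headingReg = /<(h[1-6])>([\s\S]+?)<\/\1>/gi  // matchAll requires the g flag

// Spread the iterator into an array, then keep the parts of interest.
const found = [...html.matchAll(headingReg)].map(m => ({
  tag: m[1],      // first capture group
  text: m[2],     // second capture group
  index: m.index  // where the match starts
}))

console.log(found.map(item => item.tag))   // ['h1', 'h2', 'h1']
console.log(found.map(item => item.text))  // ['houdunren.com', 'hdcms.com', '后盾人']
console.log(found[1].index)                // 22
```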

Supporting older browsers, method 1: recursion

<body>
  <h1>houdunren.com</h1>
  <h2>hdcms.com</h2>
  <h1>后盾人</h1>
</body>

<script>
  // matchAll
  String.prototype.matchAll = function (reg) {
    let res = this.match(reg)
    if (res) {
      let str = this.replace(res[0], '^'.repeat(res[0].length))
      let match = str.matchAll(reg) || []
      return [res, ...match]
    }
  }
  let body = document.querySelector('body').innerHTML
  let search = body.matchAll(/<(h[1-6])>[\s\S]+?<\/\1>/i)
  console.log(search)
</script>

Method 2:

<body>
  <h1>houdunren.com</h1>
  <h2>hdcms.com</h2>
  <h1>后盾人</h1>
</body>

<script>
  // reg must have the g flag, otherwise exec never advances and the loop does not end
  function search(string, reg) {
    let result = []
    let res
    while ((res = reg.exec(string))) {
      result.push(res)
    }
    return result
  }
  let matchs = search(document.body.innerHTML, /<(h[1-6])>[\s\S]+?<\/\1>/ig)
  console.log(matchs)
</script>

3.5 Using search and match

Basic use of search: returns the index of the first match

// with a string
let hd = 'houdunren.com'
console.log(hd.search('h')) // 0

// with a regular expression
console.log(hd.search(/u/)) // 2

Basic use of match:

let hd = `
  https://hdcms.com
  https://www.sina.com.cn
  https://www.houdunren.com
`
let reg = /https?:\/\/(?:\w+\.)?(?:\w+\.)+(?:com|cn|org|cc)/ig
// let matches = hd.matchAll(reg)
// for (iterator of matches) {
//   console.log(iterator)
// }
console.log(hd.match(reg)) // ['https://hdcms.com', 'https://www.sina.com.cn', 'https://www.houdunren.com']

3.6 Splitting strings

let hd = '2020/09/12'
console.log(hd.split(/[-\/]/)) // ['2020', '09', '12']
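
One detail worth knowing, shown in the sketch below: when the separator pattern contains a capture group, split also returns the separators themselves. The mixed-separator date is an invented input.

```javascript
// Hypothetical date string with two different separators.
const date = '2020/09-12'

const parts = date.split(/[-\/]/)             // separators are dropped
const partsWithSeps = date.split(/([-\/])/)   // captured separators are kept

console.log(parts)          // ['2020', '09', '12']
console.log(partsWithSeps)  // ['2020', '/', '09', '-', '12']
```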

3.7 Using $ in regex replacements

let hd = '(010)99999999 (020)8888888'
let reg = /\((\d{3,4})\)(\d{7,8})/g
console.log(hd.replace(reg, '$1-$2')) // 010-99999999 020-8888888

hd = '=后盾人='
// $& is the matched text
console.log(hd.replace(/后盾人/, '$&-bilibili')) // =后盾人-bilibili=

hd = '%后盾人=='
// $` is the text before the match
console.log(hd.replace(/后盾人/, '$`')) // %%==
// $' is the text after the match
console.log(hd.replace(/后盾人/, "$'")) // %====

// summary
hd = '=后盾人='
console.log(hd.replace(/后盾人/, "$'$'$&$'$'")) // ===后盾人===

3.8 Using $& (highlighting some content in HTML)

const main = document.querySelector('body main')
main.innerHTML = main.innerHTML.replace(/教育/g, `<a href="https://www.baidu.com">$&</a>`)

3.9 Changing http to https and adding www

<body>
  <main>
    <a style="color: red;" href="http://hdcms.com">
      开源系统
    </a>
    <a id="l1" href="http://houdunren.com">后盾人</a>
    <a href="http://yahoo.com">雅虎</a>
    <h4>http://www.hdcms.com</h4>
  </main>
</body>

<script>
  // change http to https and add www where it is missing; only applies to hdcms and houdunren
  const main = document.querySelector('body main')
  const reg = /(<a.*href=['"])(http)(:\/\/)(www\.)?(hdcms|houdunren)/ig
  main.innerHTML = main.innerHTML.replace(reg, (v, ...args) => {
    args[1]+='s'
    args[3] = args[3] || 'www.'
    return args.splice(0, 5).join('')
  })
</script>

3.10 Named capture groups

Put ?<xxx> at the very start of a group, inside the parentheses; xxx becomes the name of that group

let hd = `
  <h1>houdunren</h1>
  <span>后盾人</span>
  <h2>hdcms</h2>
`
// ?<any name>: gives the group a name
const reg = /<(h[1-6])>(?<content>.*?)<\/\1>/gi
// console.log(hd.replace(reg, '<h4>$2</h4>'))
console.log(hd.replace(reg, '<h4>$<content></h4>'))
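
Named groups are also available on the match result through its groups property. A small sketch with an invented date pattern:

```javascript
// Hypothetical pattern: three named groups for the parts of an ISO-style date.
const dateReg = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/

// groups holds one property per named group
const { year, month, day } = '2022-03-23'.match(dateReg).groups
console.log(year, month, day)  // 2022 03 23

// $<name> refers to a named group inside a replacement string
const reordered = '2022-03-23'.replace(dateReg, '$<day>/$<month>/$<year>')
console.log(reordered)  // 23/03/2022
```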

3.11 Elegant: tidying up a regex with named groups

<body>
  <main>
    <a id="hd" href="https://www.houdunren.com">后盾人</a>
    <a href="https://www.hdcms.com">hdcms</a>
    <a href="https://www.sina.com.cn">新浪</a>
  </main>
</body>

<script>
  // target: [{link:'',title:''}]
  const main = document.querySelector('body main')
  // link is matched lazily so that it stops at the first closing quote
  const reg = /<a.*?href=(['"])(?<link>.*?)\1>(?<title>.*?)<\/a>/ig
  const links = []
  for (const iterator of main.innerHTML.matchAll(reg)) {
    links.push(iterator['groups'])
  }
  console.dir(links)
  /* resulting array
  0: {link: 'https://www.houdunren.com', title: '后盾人'}
  1: {link: 'https://www.hdcms.com', title: 'hdcms'}
  2: {link: 'https://www.sina.com.cn', title: '新浪'}
  */
</script>

3.12 The regex method test

// string methods, for comparison: match, matchAll, search, replace
const mail = document.querySelector(`[name='email']`)
mail.addEventListener('keyup', e => {
  let value = e.target.value
  let flag = /^[\w-]+@(\w+\.)+(com|org|cn)$/i.test(value)
  console.log(flag)
})

3.13 The regex method exec

let reg = /后盾人/g
const main = document.querySelector('body main')
let count = 0
let res
while ((res = reg.exec(main.innerHTML))) {
  count++
}
console.log(count) // 2 (when the page contains 后盾人 twice)

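4. Advanced regex

4.1 ?= lookahead assertion (followed by)

An assertion placed before or after a pattern requires the asserted text to be present at that position, otherwise there is no match; the asserted text itself is not part of the match

<body>
  <main>
    后盾人不断分享视频教程，学习后盾人教程提升编程能力
  </main>
</body>

<script>
  let main = document.querySelector('main')
  let reg = /后盾人(?=教程)/g
  main.innerHTML = main.innerHTML.replace(reg, `<a href="https://houdunren.com">$&</a>`)
  /*
  Result: only the 后盾人 that is followed by 教程 (set off with spaces below) becomes a clickable link
  后盾人不断分享视频教程，学习 后盾人 教程提升编程能力
  */
</script>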

4.2 Normalizing prices with an assertion

// bring every price into the xxx.00 format
let lessons = `
  js,200元,300次
  php,300.00元,100次
  node.js,180元，260次
`
// the dot is escaped so that it only matches a literal '.'
let reg = /(\d+)(\.00)?(?=元)/ig
lessons = lessons.replace(reg, (v, ...args) => {
  args[1] = args[1] || '.00'
  return args.splice(0, 2).join('')
})
console.log(lessons)

4.3 ?<= lookbehind assertion (preceded by)

<body>
  <main>
    <a href="https://baidu.com">百度</a>
    <a href="https://yahoo.com">雅虎</a>
  </main>
</body>

<script>
  const main = document.querySelector('main')
  const reg = /(?<=href=(['"])).+(?=\1)/ig
  main.innerHTML = main.innerHTML.replace(reg, 'https://www.houdunren.com')
</script>

4.4 Masking phone numbers with an assertion

let users = `
  向军电话: 12345678901
  后盾人电话: 98745675603
`
let reg = /(?<=\d{7})\d{4}/ig
users = users.replace(reg, v => {
  return '*'.repeat(v.length)
})
console.log(users)
/*
Result:
  向军电话: 1234567****
  后盾人电话: 9874567****
*/
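
Where lookbehind is not available, capture groups can do the same job for 11-digit numbers. This is an alternative sketch, not the approach used in the notes above; maskPhone is a hypothetical helper name.

```javascript
// Hypothetical helper: keep the first 7 digits, replace the next 4 with asterisks.
function maskPhone(text) {
  return text.replace(/(\d{7})(\d{4})/g, (match, head, tail) => {
    return head + '*'.repeat(tail.length)
  })
}

console.log(maskPhone('12345678901'))                // 1234567****
console.log(maskPhone('a: 98745675603, b: 12345'))   // a: 9874567****, b: 12345
```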

4.5 ?! negative lookahead (not followed by)

let hd = 'houdunren2010hdcms'
let reg = /[a-z]+(?!\d+)$/i // letters at the end of the string that are not followed by digits: hdcms
console.log(hd.match(reg))

4.6 Banning a keyword in user names with an assertion

const input = document.querySelector(`[name="username"]`)
input.addEventListener('keyup', function () {
  const reg = /^(?!.*向军.*)[a-z]{5,6}$/i // 向军 must not appear anywhere between the start and the end
  console.log(this.value.match(reg))
})
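
The same technique can be tried without a form. In the sketch below the banned word admin and the length limits are assumptions chosen so that the lookahead is what actually decides the result.

```javascript
// Hypothetical rule: a letter followed by 4 to 9 word characters, and "admin" nowhere in the name.
const usernameReg = /^(?!.*admin)[a-z]\w{4,9}$/i

console.log(usernameReg.test('houdunren'))  // true
console.log(usernameReg.test('myadmin01'))  // false: contains the banned word
console.log(usernameReg.test('hd'))         // false: too short
```

The lookahead is checked at the start of the string and consumes nothing, so the rest of the pattern still has to match the whole name.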

4.7 ?<! negative lookbehind (not preceded by)

const hd = 'hdcms99houdunren'
let reg = /(?<!\d+)[a-z]+/i
console.log(hd.match(reg)) // ['hdcms', ...]

4.8 Unifying data by exclusion with assertions

<body>
  <main>
    <a href="https://www.houdunren.com/1.jpg">1.jpg</a>
    <a href="https://oss.houdunren.com/2.jpg">2.jpg</a>
    <a href="https://cdn.houdunren.com/3.jpg">3.jpg</a>
    <a href="https://houdunren.com/4.jpg">后盾人</a>
  </main>
</body>

<script>
  const main = document.querySelector('main')
  const reg = /https:\/\/([a-z]+)?(?<!oss)\..+?(?=\/)/gi
  main.innerHTML = main.innerHTML.replace(reg, v => {
    return 'https://oss.houdunren.com'
  })
</script>
