Punctuation Removal

Standardization Definition

Description

The Punctuation Removal standardization definition removes punctuation characters.

Examples

  Input                      Output
  100 Main St. Apt #100      100 Main St Apt 100
  (919) 541-2777 ext. 500    919 541-2777 ext 500
  「文字」                   文字
  Client_Code                ClientCode

Remarks

Punctuation characters are defined as characters in the following Unicode general categories:

  • Punctuation, Open (Ps)
  • Punctuation, Close (Pe)
  • Punctuation, Initial quote (Pi)
  • Punctuation, Final quote (Pf)
  • Punctuation, Connector (Pc)
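  • Punctuation, Other (Po)

Characters in other categories are retained. For example, the hyphen in 541-2777 belongs to the Punctuation, Dash (Pd) category, which is not in the list, so it is kept.

To remove symbols, use the Symbol Removal standardization definition.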
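
As an illustration only, the category-based rule in the Remarks can be approximated in Python with the standard unicodedata module. The names REMOVED_CATEGORIES and remove_punctuation are hypothetical and are not part of the product. The sketch only filters characters by Unicode general category and does not model any other processing the definition may perform.

```python
import unicodedata

# General categories listed in the Remarks. Punctuation, Dash (Pd) is
# deliberately absent, so hyphens are kept.
REMOVED_CATEGORIES = {"Ps", "Pe", "Pi", "Pf", "Pc", "Po"}


def remove_punctuation(text: str) -> str:
    """Drop every character whose general category is in REMOVED_CATEGORIES.

    Hypothetical helper for illustration; not the product's implementation.
    """
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) not in REMOVED_CATEGORIES
    )


if __name__ == "__main__":
    # Inputs taken from the Examples table.
    for sample in [
        "100 Main St. Apt #100",
        "(919) 541-2777 ext. 500",
        "「文字」",
        "Client_Code",
    ]:
        print(repr(sample), "->", repr(remove_punctuation(sample)))
```

Under these assumptions, the four inputs from the Examples table produce the outputs shown there, including the retained hyphen in the telephone number.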