Paste dirty text in tinymce

Hi, I've noticed a strange behavior in tiny:
one of our users converted a PDF to a plain text file and then copy/pasted the text into tiny. He made some changes and saved it.
The view renders the text correctly, but when I try to edit that page, the widget breaks with this error:

*** ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters

The problem is that the converted text contains a control character (\x0c, a form feed) that is invisible in the editor (unless you open the source view). It leaves the object in a broken state, because we can no longer edit it from the interface.

The problem also happens if I enable the "paste as text" behavior in tiny. I thought this feature was similar to the old beloved "paste as simple text" from previous tiny versions, but it seems it is not.

How can I avoid these problems?
Is there a way to safely customize tiny/mockup to perform stricter validation, or is it better to clean the value in TextareaWidget before setting it?

I think that this option just asks the browser to change the default clipboard format, and that's all. It is up to the browser to do the rest.
The browser I currently use drops the images and removes HTML tags when asked for plain text, but it does not strip invalid XML characters (a form feed is a valid text character, and probably a valid HTML character; I lack the patience to read the W3C spec).
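
To check whether a given piece of text carries such characters, something along these lines can be run in the browser console. It is only a sketch: findXmlControlChars is a made-up helper name, and the character class assumes the offenders are the C0 control characters that XML 1.0 does not allow (everything below \x20 except tab, line feed and carriage return).

```javascript
// Hypothetical helper (not part of TinyMCE or Plone): report the position and
// code point of C0 control characters that XML 1.0 does not allow.
function findXmlControlChars(text) {
  var found = [];
  // Everything below U+0020 except tab (09), line feed (0A) and carriage return (0D).
  var re = /[\x00-\x08\x0B\x0C\x0E-\x1F]/g;
  var match;
  while ((match = re.exec(text)) !== null) {
    var hex = match[0].charCodeAt(0).toString(16).toUpperCase();
    found.push({ index: match.index, code: 'U+' + hex.padStart(4, '0') });
  }
  return found;
}

// A form feed hidden between two "pages", as a PDF-to-text conversion might leave it.
var sample = 'Page one\x0cPage two';
console.log(findXmlControlChars(sample));
```

An empty array means the text contains none of these control characters.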

As for how best to fix this problem: you could write a TinyMCE plugin. Whether that is the right fix, I am not sure; probably not.

Anyway, here is one that seems to work for Plone 5.1.0 -> 5.1.2:

(function () {
var cleanHtml = (function () {
  'use strict';

  var PluginManager = tinymce.util.Tools.resolve('tinymce.PluginManager');

  PluginManager.add('cleanHtml', function (editor, url) {
    // Runs on pasted content before it is inserted into the editor.
    editor.on('PastePreProcess', function (e) {
      // Drop every character outside the listed ranges (control characters such as \x0c included).
      e.content = e.content.replace(/[^\x09\x0A\x0D\x20-\xff\x85\u00A0-\uD7FF\uE000-\uFDCF\uFDF0-\uFFFD]/gm, '');
    });
  });

  function Plugin () {}

  return Plugin;
}());
})();
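
The replacement can also be tried outside the editor. The sketch below wraps the same character class in a standalone function; stripXmlUnsafe and XML_UNSAFE are names made up for illustration, and the sample string is invented.

```javascript
// Same character class as in the plugin above; the names are hypothetical.
var XML_UNSAFE = /[^\x09\x0A\x0D\x20-\xff\x85\u00A0-\uD7FF\uE000-\uFDCF\uFDF0-\uFFFD]/gm;

// Return a copy of `content` with every character outside the allowed ranges removed.
function stripXmlUnsafe(content) {
  return content.replace(XML_UNSAFE, '');
}

// Invented sample: a form feed and a NULL byte inside otherwise ordinary markup.
var pasted = '<p>Page one\x0cPage two\x00</p>';
var cleaned = stripXmlUnsafe(pasted);
console.log(JSON.stringify(cleaned)); // "<p>Page onePage two</p>"
```

One thing to be aware of: the regex has no `u` flag, so it works on UTF-16 code units, and the surrogate range is not in the allowed set. Characters outside the Basic Multilingual Plane (most emoji, for example) would therefore be removed as well. That may or may not be acceptable for your content.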