Contributed by: Rubens Queiroz de Almeida
Publication date: November 11, 1998
In a message previously sent to this list, I explained how I automate sending messages and maintaining the Web site (http://www.dicas-l.com.br/dicas-l/dicas-l/970701.html).
At that time, I was not yet automatically converting URLs into real links. I simply wrote the URL in the text, so the resulting HTML page did not let readers click on it to go to the site mentioned.
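As a minimal sketch of the kind of conversion described here, the snippet below wraps each http URL found in a line of text in an HTML anchor. The sub name `linkify` and the sample sentence are hypothetical, and the pattern is deliberately simplified: it only handles `http:` and would also swallow a period that directly follows a URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original scripts):
# wrap every http URL in the given line in an <A HREF="..."> anchor.
sub linkify {
    my ($line) = @_;
    # Simplified URL pattern: "http:" followed by word characters,
    # slashes, dots, colons, plus signs and hyphens.
    $line =~ s{(http:[\w/.:+\-]+)}{<A HREF="$1">$1</A>}g;
    return $line;
}

print linkify("Visit http://www.dicas-l.com.br/dicas-l/ for the archive\n");
```

Run on the sample sentence, this turns the bare address into a clickable link while leaving the rest of the line untouched.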
To solve this problem, I started using a modified version of the Perl script asc2html, included in this message. I removed a few things from it and created another Perl script, which I named urlconverter, also included in this message.
As a result, visitors to the site can see that the Dicas-L Web pages are now friendlier :-)
The scripts follow:
ASC2HTML

#! /usr/bin/perl
#
# pre --- produced pre-formatted HTML text
#
# Author: Oscar Nierstrasz (June 25, 1993)
# 4.8.93 -- incorporated url'href.

foreach $file (@ARGV) {
    print "<TITLE>Asci file: $file</TITLE>\n<PRE>\n";
    while(<>) {
        study;
        # escape HTML special characters (ampersand first):
        s/&/&amp;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
        &url'href;
        print;
    }
    print "</PRE>\n";
}

# Try to recognize URLs and ftp file identifiers and convert them into HREFs:
# This routine is evolving. The patterns are not perfect.
# This is really a parsing problem, and not a job for perl ...
# It is also generally impossible to distinguish ftp site names
# from newsgroup names if the ":<directory>" is missing.
# An arbitrary file name ("runtime.pl") can also be confused.
sub url'href {
    # study; # doesn't speed things up ...

    # to avoid special cases for beginning & end of line
    s|^|>>>|; s|$|<<<|;

    # URLS:
    s|(news:[\w.]+)|<A HREF="$&">$&</A>|g;
    s|(http:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(file:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(ftp:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(wais:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(gopher:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(telnet:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    # s|(\w+://[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;

    # catch some newsgroups to avoid confusion with sites:
    s|([^\w\-/.:@>])(alt\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(bionet\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(bit\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(comp\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(gnu\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(misc\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(news\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(rec\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;

    # FTP locations (with directory):
    s|(anonymous@)([a-zA-Z][\w.+\-]+\.[a-zA-Z]{2,}):(\s*)([\w+\-/.]+)|$1<A HREF="file://$2/$4">$2:$4</A>$3|g;
    s|(ftp@)([a-zA-Z][\w.+\-]+\.[a-zA-Z]{2,}):(\s*)([\w+\-/.]+)|$1<A HREF="file://$2/$4">$2:$4</A>$3|g;
    s|([^\w\-/.:@>])([a-zA-Z][\w.+\-]+\.[a-zA-Z]{2,}):(\s*)([\w+\-/.]+)|$1<A HREF="file://$2/$4">$2:$4</A>$3|g;
    # NB: don't confuse an http server with a port number for
    # an FTP location!
    # internet number version:
    s|([^\w\-/.:@])(\d{2,}\.\d{2,}\.\d+\.\d+):([\w+\-/.]+)|$1<A HREF="file://$2/$3">$2:$3</A>|g;

    # just the site name (assume two dots):
    s|([^\w\-/.:@>])([a-zA-Z][\w+\-]+\.[\w.+\-]+\.[a-zA-Z]{2,})([^\w\-/.:!])|$1<A HREF="file://$2">$2</A>$3|g;
    # NB: can be confused with newsgroup names!
    # <site>.com has only one dot:
    s|([^\w\-/.:@>])([a-zA-Z][\w.+\-]+\.com)([^\w\-/.:])|$1<A HREF="file://$2">$2</A>$3|g;

    # just internet numbers:
    s|([^\w\-/.:@])(\d+\.\d+\.\d+\.\d+)([^\w\-/.:])|$1<A HREF="file://$2">$2</A>$3|g;
    # unfortunately inet numbers can easily be confused with
    # european telephone numbers ...

    s|^>>>||; s|<<<$||;
}

__END__

--------------------

URLCONVERTER

#! /usr/bin/perl
#
# pre --- produced pre-formatted HTML text
#
# Author: Oscar Nierstrasz (June 25, 1993)
# 4.8.93 -- incorporated url'href.

foreach $file (@ARGV) {
    while(<>) {
        study;
        &url'href;
        print;
    }
}

# Try to recognize URLs and ftp file identifiers and convert them into HREFs:
# This routine is evolving. The patterns are not perfect.
# This is really a parsing problem, and not a job for perl ...
# It is also generally impossible to distinguish ftp site names
# from newsgroup names if the ":<directory>" is missing.
# An arbitrary file name ("runtime.pl") can also be confused.
sub url'href {
    # study; # doesn't speed things up ...

    # to avoid special cases for beginning & end of line
    s|^|>>>|; s|$|<<<|;

    # URLS:
    s|(news:[\w.]+)|<A HREF="$&">$&</A>|g;
    s|(http:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(file:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(ftp:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(wais:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(gopher:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    s|(telnet:[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;
    # s|(\w+://[\w/.:+\-]+)|<A HREF="$&">$&</A>|g;

    # catch some newsgroups to avoid confusion with sites:
    s|([^\w\-/.:@>])(alt\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(bionet\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(bit\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(comp\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(gnu\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(misc\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(news\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;
    s|([^\w\-/.:@>])(rec\.[\w.+\-]+[\w+\-]+)|$1<A HREF="news:$2">$2</A>|g;

    # FTP locations (with directory):
    s|(anonymous@)([a-zA-Z][\w.+\-]+\.[a-zA-Z]{2,}):(\s*)([\w+\-/.]+)|$1<A HREF="file://$2/$4">$2:$4</A>$3|g;
    s|(ftp@)([a-zA-Z][\w.+\-]+\.[a-zA-Z]{2,}):(\s*)([\w+\-/.]+)|$1<A HREF="file://$2/$4">$2:$4</A>$3|g;
    s|([^\w\-/.:@>])([a-zA-Z][\w.+\-]+\.[a-zA-Z]{2,}):(\s*)([\w+\-/.]+)|$1<A HREF="file://$2/$4">$2:$4</A>$3|g;
    # NB: don't confuse an http server with a port number for
    # an FTP location!
    # internet number version:
    s|([^\w\-/.:@])(\d{2,}\.\d{2,}\.\d+\.\d+):([\w+\-/.]+)|$1<A HREF="file://$2/$3">$2:$3</A>|g;

    # just the site name (assume two dots):
    s|([^\w\-/.:@>])([a-zA-Z][\w+\-]+\.[\w.+\-]+\.[a-zA-Z]{2,})([^\w\-/.:!])|$1<A HREF="file://$2">$2</A>$3|g;
    # NB: can be confused with newsgroup names!
    # <site>.com has only one dot:
    s|([^\w\-/.:@>])([a-zA-Z][\w.+\-]+\.com)([^\w\-/.:])|$1<A HREF="file://$2">$2</A>$3|g;

    # just internet numbers:
    s|([^\w\-/.:@])(\d+\.\d+\.\d+\.\d+)([^\w\-/.:])|$1<A HREF="file://$2">$2</A>$3|g;
    # unfortunately inet numbers can easily be confused with
    # european telephone numbers ...

    s|^>>>||; s|<<<$||;
}

__END__

--------------------
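Besides wrapping the text in a PRE block, asc2html escapes HTML special characters before it adds any links, a step that urlconverter leaves out. The sketch below isolates that escaping step; the sub name `escape_html` and the sample line are hypothetical and not taken from the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape the three characters HTML treats specially.
# The ampersand must be handled first; otherwise the "&" introduced by
# "&lt;" and "&gt;" would itself be escaped a second time.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_html("if (a < b && b > c) see http://www.dicas-l.com.br/\n");
```

Escaping has to happen before the link substitutions, since running it afterwards would also escape the angle brackets of the generated anchor tags.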