
How AI Is Redefining the Way We Find Content, by @Clearleft:

Original Garfield comic from November 13, 2021
Text replaced with lyrics from: Crawling

Transcript:
• Against My Will I Stand Beside My Own Reflection
• It's Haunting

The image consists of a comic strip with three panels, each featuring the same cat wearing a sweater, sitting in front of a ball of yarn. The panels are arranged vertically, with each panel spanning a third of the height of the image. In the first panel, the cat is sitting on the left side of the image. In the second panel, the cat is sitting in the middle section of the image. Finally, in the third panel, the cat is sitting on the right side of the image.

Each panel shows the cat wearing a sweater and staring at the ball of yarn. The panels are colored with different shades of orange and blue, creating a vibrant and visually appealing comic strip. The cat seems to be the main subject of the comic, with the sweater and yarn as the central objects.

cerberus costume...

#cerberus #dog #costume #spider #crawling #animatedgif

[Translation] The Quiet Death of robots.txt

For decades, robots.txt governed the behavior of web crawlers. But today, as unscrupulous AI companies chase ever-larger volumes of data, the basic social contract of the web is starting to fall apart. For three decades, a tiny text file kept the Internet from descending into chaos. The file carried no particular legal or technical weight, and it was not even especially complicated. It represents a handshake agreement among the Internet's pioneers to respect each other's wishes and to build the Internet so that everyone benefits. It is a mini-constitution for the Internet, written in code. The file is called robots.txt; it usually lives at yourwebsite.com/robots.txt . It lets anyone who owns a site, whether a small cooking blog or a multinational corporation, tell the web what is and is not allowed on it. Which search engines may index your site? Which archival projects may download and save copies of a page? May competitors monitor your pages? You decide, and you declare it to the web. The system is imperfect, but it works. Or at least it did. For decades the main audience for robots.txt was search engines: the owner allowed scraping, and in return the engines promised to send users to the site. AI has changed that equation: companies around the world use sites and their data to assemble huge training datasets and build models and products that may not acknowledge the original sources at all. robots.txt works on a give-and-take basis, but a great many people have come to feel that AI companies only like to take. So much money has been poured into AI, and the technology is advancing so fast, that many site owners cannot keep up. And the fundamental agreement underlying robots.txt, and the web as a whole, may be losing its force as well.

https://habr.com/ru/companies/ruvds/articles/987416/

#robotstxt #вебкраулер #crawling #openai #ruvds_перевод

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from July 1, 2021
Text replaced with lyrics from: Crawling

Transcript:
• Confusing What Is Real
• Discomfort, Endlessly Has Pulled Itself Upon Me

The image is a three-panel comic strip featuring a cat and a dog. The cat is on the left, holding a blue and white bowl, with a thought bubble above its head stating, "Confronting what is real." The dog is on the right, with a thought bubble above its head saying, "I'm tired of pulling it's tail." The rest of the comic strip is filled with humor and situational scenarios, showcasing the cat and the dog's adventures together.

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from May 3, 2021
Text replaced with lyrics from: Crawling

Transcript:
• Fear Is How I Fall
• Confusing What Is
• Real
• Discomfort, Endlessly Has Pulled Itself Upon Me
• Distracting, Reacting

This is a cartoon strip featuring three panels, each depicting a different scene involving a character named Garfield. In the first panel, Garfield is lying on the ground with his head resting on the floor. The second panel shows the character talking with another person while making a funny face. The third panel depicts Garfield sitting on the ground, looking down while the character from the first panel stands behind him. The strip is set against a gray background, and the panels are arranged in a vertical order.

Crawl budget determines how Google crawls and indexes your website pages. Managing it properly ensures that important content gets discovered quickly. Let’s explore simple strategies to improve SEO results!

Website: https://ondigitals.com/crawl-budget/
#ondigitals #ondigitalsagency #crawlbudget #crawling

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from January 2, 2021
Text replaced with lyrics from: Crawling

Transcript:
• Crawling In My Skin
• These Wounds, They Will Not Heal

The image features a garfield comic strip titled "Crawling in My Skin". There are three panels in total, each depicting a different scenario.

In the first panel, a garfield is shown crawling in his skin while he is using a laptop computer. This panel captures a humorous moment and provides a visual representation of the garfield's crawling experience.

The second panel depicts a garfield looking at a computer screen. It shows the garfield's curiosity towards the device and his attempt to understand it. This panel adds an element of surprise and intrigue to the comic strip.

The third panel features a garfield using a computer mouse. This panel captures a more practical aspect of the garfield's interaction with technology.

Overall, the comic strip provides various illustrations of the garfield's experiences with technology, making it a visually engaging and entertaining piece.

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from October 26, 2020
Text replaced with lyrics from: Crawling

Transcript:
• Against My Will I Stand Beside My Own Reflection
• It's Haunting
• How I Can't Seem
• To Find Myself Again

The comic strip is a collection of three panels, each depicting a different scene. In the first panel, a cat is lying on a chair, and a caption reads, "Agonist my will; Desiring my own reflection. How can I ever be happy when I can't even see my own reflection?". This panel is followed by a panel where the cat is seen sitting in front of a computer, and a caption reads, "The cat is staring at the computer screen. How can I ever be happy when I can't even see my own reflection?". In the third panel, the cat is seen sitting in a chair and appears to be enjoying his time. The caption in this panel reads, "The cat is napping on the chair. How can I ever be happy when I can't even see my own reflection?".

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from April 30, 2020
Text replaced with lyrics from: Crawling

Transcript:
• Distracting, Reacting
• Against My Will I Stand Beside My Own Reflection
• It's Haunting

--------------
Original Text:
• Jon: Hey, Garfield... Have you seen the big suitcase?
• Garfield: You mean my new lunch tote?

The image is a comic strip featuring three panels. In the first panel, a man stands in front of a desk, with a suitcase beside him. The second panel shows the man placing his hand over the suitcase. The third panel shows the man pointing at the suitcase with his hand. The panels are arranged in a line, with each panel having a caption below it. The overall scene is light-hearted and humorous, as the man interacts with the suitcase.

From this summer (July 2nd) until today (Dec 22nd), the OpenAI GPTBot has fetched 2,659,115 pages from my Sundial demo calendar, which has a robots.txt telling crawlers to not bother, as there is an infinite number of pages in the calendar.

The furthest back their bot has reached so far is the year -1222, and the furthest in the future they have reached so far is the year 7776...

My accidental AI crawler tarpit keeps on serving pages.
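
For readers unfamiliar with how a robots.txt rule is meant to be honored, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the example.org URLs, and the is_allowed helper are all hypothetical, chosen for illustration; they are not the Sundial demo's actual file or paths. The check is purely advisory: it only has an effect if the crawler chooses to run it before fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site with an endless calendar.
# These rules are illustrative, not the real Sundial configuration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.org/",
        "https://example.org/calendar/7776/01",
    ):
        print(url, is_allowed(ROBOTS_TXT, "GPTBot", url))
```

A well-behaved crawler would run a check like this before every request and skip anything under the disallowed path, which is exactly what keeps it out of an infinite page space.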

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from March 17, 2020
Text replaced with lyrics from: Crawling

Transcript:
• Discomfort, Endlessly Has
• Pulled Itself Upon Me
• Distracting, Reacting

--------------
Original Text:
• "Echo Point"
• Garfield: Yawn!
• Voice: Z.
• Garfield: Don't get ahead!

The image is a vibrant comic strip that features a series of scenes depicting a Garfield-themed cat. The comic strip is split into three panels, each highlighting different aspects of the cat's adventures and misadventures.

In the first panel, the Garfield-themed cat is shown on his feet, walking on top of a hill. The cat appears to be contemplating his next move, possibly trying to scale the hill or looking for a way to descend the incline.

In the second panel, the cat is seen attempting to climb a tree, using its claws to grip the bark. It is at this point that a line of text is prominently displayed, reading "ECHO POINT." The cat is likely trying to communicate with the Echo Point or looking for a way to reach the tree.

The final panel shows the Garfield-themed cat successfully climbing the tree, reaching the Echo Point with ease. It seems that the cat has finally accomplished its goal, possibly to get the Echo Point to listen to him or to inform him about his surroundings.

📬 Wikipedia pulls the plug on AI
#Internet #KünstlicheIntelligenz #Netzpolitik #Crawling #CreativeCommons #Endowment #KI #Spendenbanner #WikimediaFoundation #Wikipedia https://sc.tarnkappe.info/46f786

I'm (slowly, stutteringly) writing a website link checker, purely to get a bit of practice in Rust. (No use of chatbots/LLMs at any point.)

It's got to the point where I have a functional, but buggy, single-threaded site crawler which works a bit like the (perfectly good) W3C Link Checker, but runs in the console.

After bug fixing, I next want to use threading to fetch multiple pages at once, because I rarely get a chance to work with concurrency.

#Rust #RustLang #Crawling #HTML #WebDev

scanning pubmed stores in neo4j cypher queries

neo4j@neo4j> match (n:Researcher)-[:AUTHORED]->(p:Publication) return n.name, collect(p.doi) as pubs limit 100;

syntax a bit verbose but why not

#neo4j #crawling #medicine #research

#Garfield #music #Crawling
#bot #lyrics

Original Garfield comic from November 1, 2019
Text replaced with lyrics from: Crawling

Transcript:
• Distracting,
• Reacting
• Against My Will I Stand Beside My Own Reflection
• It's Haunting
• How I Can't Seem
• To Find Myself Again

--------------
Original Text:
• Arlene: Purrr.
• Garfield: That was a great meal.
• Arlene: Such a large menu.
• Garfield: And such interesting combinations.
• Arlene: We must come back.
• Garfield: My compliments to the dumpster!

The image is a comic strip featuring multiple panels, each depicting different scenes. The panels show various characters like a cat and an orange cat sitting and having conversations. One character is talking about wanting to find themselves, while another character is discussing the concept of being haunted. In the last panel, the cat and the orange cat both appear to be sitting and having conversations with each other. The overall theme of the comic strip is one of contemplation and curiosity, as the characters explore different ideas and emotions.

Felipe’s Friday Forage: Unlock SEO Secrets and Elevate Your Content’s Visibility

Felipe’s Friday Forage explores how search engines rank content through three steps: discovery and crawling, relevance and indexing, and authority and ranking. Websites must ensure easy navigation, clear content, and build trust to enhance visibility. SEO success requires patience and consistent effort, as ranking is a cumulative process, not immediate.

https://dreamspacestudio.net/felipes-friday-forage-unlock-seo-secrets-and-elevate-your-contents-visibility/

#crawling
