Posted

I don't know if anyone knows this, but I have a little bit of knowledge of Python and HTML. I was wondering if I could find a way to access a gallery of pictures with just one word describing them and with links. I need something like the "requests" Python library, but one that also puts the name into its database. I will then use numpy to store the pictures fetched through the links and convert them with this tool. Okay, now to the part where you get to know what I am actually doing. I am creating a reverse stable diffusion software that turns images (whether they are ai generated or not) into prompts. I am going through that process now, and then I will use Stable Diffusion to merge the picture with a randomly generated image of every single word in the dictionary (other than the inappropriate stuff). Then, based on how closely it matches, as a percentage, it will add that word to the library of words that it will use. After it finds enough matches, it will give you those words, and with a grammar API it will turn them into something that makes sense and not something like "boy, pig, fish". You get what I am saying. I just need some help to finish this.

Posted

What kind of help do you need?

 

How to retrieve HTML from a specific URL?

Try this: https://www.tutorialspoint.com/downloading-files-from-web-using-python

 

How to analyze HTML and extract links from it? You need to find the src attribute of the img tags. str.index(), str.find() or a regex can be used to do it.

https://www.w3schools.com/python/python_regex.asp
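A minimal sketch of the regex approach, assuming the page's HTML is already in a string. The pattern and the helper name find_img_srcs are illustrative only. A regex like this is a quick approximation (it would also match attributes such as data-src), so a real HTML parser is likely the more robust choice.

```python
import re

# Capture the value of a quoted src attribute inside an <img ...> tag.
IMG_SRC = re.compile(r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def find_img_srcs(html):
    """Return the src values of all <img> tags found in the HTML string."""
    return IMG_SRC.findall(html)
```

For example, find_img_srcs('<img src="a.png">') returns a list containing the single string a.png.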

 

5 hours ago, grayson said:

I was wondering if I could find a way to access a gallery of pictures with just one word describing them and with links.

'Dictionary' contains key-value pairs.

https://docs.python.org/3/tutorial/datastructures.html#dictionaries

You can make an abstract data type as a custom class with 'keyword', 'url' and 'path' (on local storage) fields, to hold more 'values'.
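One way this could look, as a sketch only: a small dataclass holding the three fields, stored in a dictionary keyed by the describing word. The names GalleryEntry and add_entry are hypothetical, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GalleryEntry:
    keyword: str                # the one word describing the picture
    url: str                    # where the picture lives on the web
    path: Optional[str] = None  # local file path, once it has been downloaded


def add_entry(gallery, keyword, url, path=None):
    """Store an entry in the gallery dict under its keyword and return it."""
    gallery[keyword] = GalleryEntry(keyword, url, path)
    return gallery[keyword]
```

Looking up gallery["acorn"].url then gives the link for that word, and path can be filled in later when the file is saved.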

 

 

Posted
I just need a stable diffusion tutorial

Posted

and how to fix this:

Traceback (most recent call last):
  File "my directory", line 18, in <module>
    image = Image.open(img_url)
            ^^^^^^^^^^^^^^^^^^^
  File "my directory", line 3218, in open
    fp = builtins.open(filename, "rb")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument: my directory

Of course, I can't show my actual directories.

Posted (edited)
20 minutes ago, grayson said:

and how to fix this:

Traceback (most recent call last):
  File "my directory", line 18, in <module>
    image = Image.open(img_url)
            ^^^^^^^^^^^^^^^^^^^

Try this:

import requests
from PIL import Image
# r.content is bytes, so wrap it in BytesIO
# (StringIO only accepts text and raises TypeError on Python 3)
from io import BytesIO

r = requests.get('https://example.com/image.jpg')
i = Image.open(BytesIO(r.content))

 

or

 

from PIL import Image
import requests

img = Image.open(requests.get('http://example.com/image.jpg', stream = True).raw)
img.save('image.jpg')

 

Edited by Sensei
Posted
Well, I am also using Beautiful Soup. I will show you the code so far:

import requests
from bs4 import BeautifulSoup
import numpy as np
import os
import openai
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from PIL import Image

print('type your img url in caps!')
img_url = input()
library = requests.get("https://pixabay.com/")
soup = BeautifulSoup(library.content, 'html.parser')
the_good_stuff = soup.content
Junk = soup.find_all('img')
image = Image.open(img_url)
image_metadata = Junk.info
print(image_metadata)

 

Posted

Okay, well, I need a way to simultaneously look up every word in the dictionary. And if I am using PyTorch, can I just put the URL in? How does it train on the images? (I will be using Pixabay.)

Posted

In the code you showed above, img_url is taken from user input, but you must use the content returned by BeautifulSoup.

https://www.google.com/search?q=scraping+images+python+beautifulsoup

Tutorial from the first link:

import requests 
from bs4 import BeautifulSoup 
    
def getdata(url): 
    r = requests.get(url) 
    return r.text 
    
htmldata = getdata("https://www.geeksforgeeks.org/") 
soup = BeautifulSoup(htmldata, 'html.parser') 
for item in soup.find_all('img'):
    print(item['src'])

The src in the above is a relative or absolute URL. You need to convert it to an absolute URL, pass it to requests.get() (or an alternative), then pass the downloaded bytes to Image.open() to get the image. Then use the image object wherever you need it.
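The steps just described could be sketched roughly as follows. This assumes requests and Pillow are installed; the helper names absolute_image_urls and load_image are made up for illustration. The third-party imports sit inside load_image so that the URL helper works on its own.

```python
from io import BytesIO
from urllib.parse import urljoin


def absolute_image_urls(page_url, srcs):
    """Resolve each src (relative or absolute) against the page URL."""
    return [urljoin(page_url, src) for src in srcs if src]


def load_image(url):
    """Download one image and open it with Pillow (needs network access)."""
    import requests          # third-party, assumed installed
    from PIL import Image    # third-party (Pillow), assumed installed

    response = requests.get(url, timeout=10)
    response.raise_for_status()              # fail loudly on 4xx/5xx
    return Image.open(BytesIO(response.content))
```

With the src values collected from soup.find_all('img'), something like load_image(absolute_image_urls(page_url, srcs)[0]) would give an image object for the first picture.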

Posted (edited)
11 hours ago, grayson said:

I am creating a reverse stable diffusion software that turns images (whether they are ai generated or not) into prompts.

Just curious, are you creating something like CLIP interrogator?
(The CLIP Interrogator is a tool to optimize text prompts to match a given image)

4 hours ago, grayson said:

I just need a stable diffusion tutorial

With some more understanding of your goals I may be able to share some tips on this

Edited by Ghideon
Posted
1 hour ago, Sensei said:

Tutorial from the first link:

import requests 
from bs4 import BeautifulSoup 
    
def getdata(url): 
    r = requests.get(url) 
    return r.text 
    
htmldata = getdata("https://www.geeksforgeeks.org/") 
soup = BeautifulSoup(htmldata, 'html.parser') 
for item in soup.find_all('img'):
    print(item['src'])

I tried this code on my website, which I knew had relative URLs, and confirmed it: item['src'] is relative.

In fact, the conversion from relative to absolute URLs can be done using the requests module itself:

for item in soup.find_all('img'):
    src=requests.compat.urljoin(url, item['src'])
    print(src)

... but there is a possible problem: a web page may have a <base> tag that replaces the original URL as the base for relative links. It is rarely used these days.

https://www.w3schools.com/tags/tag_base.asp
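If the <base> case ever matters, a sketch like the one below could pick the right base before joining. It uses only the standard library's html.parser, and effective_base is a hypothetical helper name. It honours only the first <base href>, which is my understanding of how browsers treat it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _BaseFinder(HTMLParser):
    """Remember the href of the first <base> tag seen, if any."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.base_href is None:
            href = dict(attrs).get("href")
            if href:
                self.base_href = href


def effective_base(page_url, html):
    """Return the URL that relative links on the page should resolve against."""
    finder = _BaseFinder()
    finder.feed(html)
    if finder.base_href:
        # The base href may itself be relative to the page URL.
        return urljoin(page_url, finder.base_href)
    return page_url
```

The result would then replace url in the urljoin(url, item['src']) call.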

( I found a serious limitation in the Python requests module: it does not accept a URL to a local file, either as a path in the local directory or with file://, so it will be harder to debug the code. )

 

Posted
41 minutes ago, Ghideon said:

Just curious, are you creating something like CLIP interrogator?
(The CLIP Interrogator is a tool to optimize text prompts to match a given image)

With some more understanding of your goals I may be able to share some tips on this

Yeah, kinda like that. But it is designed to put conjunctions and stuff after the tags. It finds the optimal keywords for getting similar results.

Posted
11 minutes ago, grayson said:

from requests
from bs4 import BeautifulSoup

Goofy ahh invalid syntax 😑

It should be:

import requests

Posted

Oh, okay. Anyway, all I need to know is how to use Beautiful Soup and TensorFlow together to sort out words. I also need a database with every (non-inappropriate) word in the dictionary. All it needs to do is let you define every word at once. If you need to know why, just read the main post.

Posted
import requests
import bs4 as BeautifulSoup
import tensorflow as ts
import numpy as np
import matplotlib as plt
from keras.models import Sequential
from keras.layers import Dense
webiste = requests.get('https://pixabay.com/')
soup = BeautifulSoup('img', "html.parser")
percentage =(75)

def data_dictionary (
aardvark = soup.find_all('aardvark'),
abacus = soup.find_all('abacus'),
abalone = soup.find_all('abalone'),
ablaze = soup.find_all('ablaze'),
a_bomb = soup.find_all('atomic bomb'),
abomination = soup.find_all('abomination'),
abstract = soup.find_all('abstract'),
acid = soup.find_all('acid'),
acorn = soup.find_all('acorn'),
acoustic_guitar = soup.find_all('acoustic guitar'),
acrobat = soup.find_all('acrobat'),
actor = soup.find_all('actor')
):

    model = Sequential()

model.add(Dense(units=64, activation='relu', input_dim=8))  # Input layer with 8 input features
model.add(Dense(units=32, activation='relu'))               # Hidden layer
model.add(Dense(units=1, activation='sigmoid'))             # Output layer
model.compile(optimizer=data_dictionary, loss='binary_crossentropy', metrics=['accuracy'])


print("IGNORE THIS MESSAGE")
print(percentage)

I haven't seen anything but syntax errors in who knows how long

also, yes, I am manually typing every word in the dictionary

 

Posted
56 minutes ago, grayson said:

I also need a database with every (non-inappropriate) word in the dictionary. All it needs to do is let you define every word at once. If you need to know why, just read the main post.

Before digging into technical aspects; don't you need some context to tell what's appropriate and what is the definition? Quick example: 
nut: usually large hard-shelled seed
nut: a small usually square or hexagonal metal block with internal screw thread

(yes, there are more homonyms; some of which may be inappropriate depending on context)

Posted
14 minutes ago, Ghideon said:

Before digging into technical aspects; don't you need some context to tell what's appropriate and what is the definition? Quick example: 
nut: usually large hard-shelled seed
nut: a small usually square or hexagonal metal block with internal screw thread

(yes, there are more homonyms; some of which may be inappropriate depending on context)

I just need every English word put into one variable. Then with BeautifulSoup I can look it up. Also, can you find why I am having syntax errors? I am relatively new to coding. Not saying I can't pull this off, though.

Posted

You have syntax errors because you have no idea what you are doing.

The interpreter/compiler gives you the line number with the error. Use this knowledge to fix the errors.

You need to experiment with less demanding projects to learn how to use all these libraries and features before you move on to an advanced project like this one.

32 minutes ago, grayson said:
soup = BeautifulSoup('img', "html.parser")

For example, here you must have an error, because what on earth is an 'img'.... ?

 

Posted
1 minute ago, Sensei said:

You have syntax errors because you have no idea what you are doing.

The interpreter/compiler gives you the line number with the error. Use this knowledge to fix the errors.

You need to experiment with less demanding projects to learn how to use all these libraries and features before you move on to an advanced project like this one.

No, I know what I am using. I called a module and it came out with a module error. I don't understand.

Posted
1 minute ago, grayson said:

I just need every English word put into one variable

google suggests:
A list with 10000 words, maybe useful as a starting point: https://www.mit.edu/~ecprice/wordlist.10000
A larger list (466k words): https://github.com/dwyl/english-words

Notes:
-verify licensing before using
-"inappropriate" is for you to define and handle
-You need a lot more than just English words (see my note above) to get going with your project
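As a rough sketch of getting such a list "into one variable": the function below reads one word per line into a set and drops anything on a blocklist you supply yourself. The name load_words, the blocklist idea and the file name in the note are assumptions for illustration; what counts as inappropriate is still yours to define.

```python
def load_words(lines, blocklist=()):
    """Build a set of lowercase alphabetic words, skipping blocked ones."""
    blocked = {word.lower() for word in blocklist}
    words = set()
    for line in lines:
        word = line.strip().lower()
        # Keep purely alphabetic entries that are not on the blocklist.
        if word and word.isalpha() and word not in blocked:
            words.add(word)
    return words
```

With a downloaded list saved locally (say as words.txt, a hypothetical name), opening the file and passing the file object as lines should work, since iterating a text file yields one line at a time.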
 

Posted
Just now, grayson said:

No, I know what I am using. I called a module and it came out with a module error. I don't understand.

You should start by reading the documentation provided by the original author of the library.

Then you should find a tutorial on how to use the library in question, with examples.

Then you should write your own experimental code to test it in a well-defined and constrained environment.

Once you have mastered it, use it in a real project..

 

Instead, you are using functions from libraries you have only just heard about in a pretty advanced project. "Go for broke"..

 

Posted
Here is what it says: "module object is not callable". I never wrote a sequence of letters or anything that has to do with 'module'.

 

BeautifulSoup("img", "html.parser")
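A likely cause, judging from the code posted earlier in the thread: "import bs4 as BeautifulSoup" binds the whole bs4 module to the name BeautifulSoup, so calling it fails, whereas "from bs4 import BeautifulSoup" binds the class. The sketch below reproduces the same mistake with the standard library's io module as a stand-in, so it runs without bs4 installed.

```python
import io as StringIO  # binds the io *module* to the name StringIO

try:
    StringIO("img")  # calling a module, not a class
except TypeError as exc:
    message = str(exc)  # the "module object is not callable" error

from io import StringIO  # binds the StringIO *class* instead

buffer = StringIO("img")  # now the call works
text = buffer.read()
```

Note also that the first argument to BeautifulSoup is the markup to parse (for example website.text), not a tag name such as "img".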

 

Posted
2 minutes ago, Ghideon said:

Another note @grayson: large-scale processing of someone else's content may be prohibited unless you have explicit permission.

Well, I guess I will ask MIT then (though nobody responds to my emails).
