Last issue established problems with Kissu's software and the need to expand into custom written user interfaces, moderation tools and customization of existing aspects.
As of this point kissu's experimental UI is in a beta state (all functionality planned out for the current revision, usable, but buggy and unstable). https://beta.kissu.moe/
The experimental UI's purpose is to bring a 4chanX-like environment to kissu, providing it with dynamic content that reacts to the user's needs and is more responsive to board changes (notifications on site content and your own). A list of features will be set below in the next post.
What we have now is a server with read only permissions on the vichan database to create its own API pages for the React front-end+server.
While it handles poorly on 800 post threads, this is likely due to poor optimization that I'll fix in the coming weeks. That being said, there is a huge improvement to viewing experience in music threads such as https://beta.kissu.moe/qa/thread/47339#qa-47339
Kissu was also targeted for the first time by script spammers with preexisting knowledge of the workings of Vichan, even going so far as to script spam my banner program with account generation. Though we managed not to cause any interruptions to people's activities on Kissu, it required a lot of energy from the staff and required me to get a moderator in addition to Cool. This gives us 22 hour surveillance over the actions happening on the site.
I also felt that time was up for /megu/ which I was uncomfortable about. The new board /ec/ offers an arguably more cultured/artistic board devoted to cute and sexy image sharing.
In the future we will need more moderation tools and I think that the following 3 will put kissu into a state where it's in full control of its own agenda.
The current three tools that I feel will help kissu and not be too hard to implement are:
- Perceptual hashing: http://blockhash.io/
This will target people's avatars and repetitive image spam. Cloudflare already provides this for child sexual abuse material (and even forwards offending IPs), but for our own purposes this is needed to put more stress on spammers obfuscating their images. At the very least, when combined with a captcha, their spam becomes less recognizable and more comedic when they have to put noise into it. Nonetheless, I think that it's worth pointing out that no captcha or spam guard is unsolvable and there need to be other solutions.
- Archive Backups
The archive on kissu doesn't yet function as a backup because post information gets wiped when it gets knocked off of the pages. Though the archive isn't planned to be a feature of the experimental UI, it is still useful for moderation and restoration of the site after captcha breaking spam attacks.
Still, what if the spammer is completely automated, breaking the captcha, flooding every second, spamming every single thread on the site… then what?
A year ago I made the proxy scraper program that searches pages for IPs and submits them into the ban table.
This is a solution that catches 1 or 2 IPs every so often. But what if it needs to be expanded to eliminate large swaths of IPs and thoroughly deny access from any webserver proxy? Well, it might be worth considering how 4chan handles rangebans. A new function should be added to vichan that runs whenever the system or a mod bans an IP address and potentially eliminates subnets when a certain threshold of bans is hit. If the IP scraper sees that a number of IPs are banned between 192.168.1.0-192.168.1.223 it will make the decision that the 192.168.1.0/24 subnet is unusable due to spam and automatically issue a ban. Naturally this will cascade upwards to 192.168.0.0/16 when a certain threshold is hit. This combined with filter rules for autobanning will create a system where script spammers are getting consistently shut out and maintain the speed of vichan which starts slowing down when the ban table hits 1000 entries.
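A rough sketch of the threshold idea in Go, not the actual vichan or Hazuki code. The function name `escalate` and the threshold of 3 are made up for illustration; the same function called with 16 instead of 24 would cover the cascade to a /16.

```go
package main

import (
	"fmt"
	"net/netip"
)

// escalate groups banned addresses by their enclosing prefix of the given
// length and returns every prefix that holds at least `threshold` bans.
func escalate(banned []string, bits, threshold int) []string {
	counts := make(map[netip.Prefix]int)
	var order []netip.Prefix // remembers first-seen order for stable output
	for _, s := range banned {
		ip, err := netip.ParseAddr(s)
		if err != nil {
			continue // skip entries that are not plain IP addresses
		}
		p, err := ip.Prefix(bits) // masks the address down to the subnet
		if err != nil {
			continue // prefix length not valid for this address family
		}
		if counts[p] == 0 {
			order = append(order, p)
		}
		counts[p]++
	}
	var out []string
	for _, p := range order {
		if counts[p] >= threshold {
			out = append(out, p.String())
		}
	}
	return out
}

func main() {
	banned := []string{"192.168.1.4", "192.168.1.77", "192.168.1.223", "192.168.9.1"}
	// Threshold of 3 is an arbitrary example value.
	fmt.Println(escalate(banned, 24, 3))
}
```

Three of the four sample bans fall inside 192.168.1.0/24, so that subnet is the only one returned; the lone 192.168.9.1 ban stays an individual entry.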
In addition to this, it means that mods don't have to put in as much thought into combatting spam making it easier for these active users to contribute to the community and not police it.
Kissu needs an automated backup system. This is not complicated to do for the database, but images will take more effort. I could pay the server provider some money for this, but unless there is a crash, I'll find a manual system, or accept recommendations on this.
These will be done using Vichan since there will be way too much unnecessary wheel reinventing otherwise
I've forgotten to add the API information necessary to trigger inline expansion. It's in a really unstable state right now as well and my priority may be kissu's security for a bit, but I will list the set of features.
- A sidebar now contains most of the important navigational tools and page information
- Banners rotate when clicked, but banner ads work as normal.
- News, boards and partners are included into one object that is draggable
- Summary tab keeps track of all new posts on the site. It also can notify you whenever a post has been made on the site.
- Global and board search allows you to easily find what you're looking for
- Makes use of Firefox's right click menu and additional tabs in Chrome
- It's a single page environment, meaning it looks like you're navigating the site but are always on the same page. This is theoretically faster and means that unless you are doing an application layer DDoS I don't have to worry about you clogging things up.
- XSS proof. There is only one instance where html may be inserted dangerously meaning user security is easier to handle.
- You can decide what boards you want to see
- You can create your own custom quote characters
- Mascots still exist
- Custom CSS still exists(and you can create your own shareable custom comment tags)
- Active threads shows you which threads have been last responded to
- Featured threads are admin curated best picks going on in kissu
- Hovering over will generate the full thread, and right clicking allows you to pin
- Partners are given their own dedicated space in recognition of their efforts.
- The overboard allows you to eliminate any board you want
- Allows for paged or catalog viewing
- The overboard is sortable
- The index watcher will now behave like boards and ignore sage.
Threads in general:
- Threads and posts can be completely removed so you don't have to worry about even seeing the stub.
- Threads are also sortable(creation date, bump order, last reply, reply count)
- Threads (will) expand inline
- Index watcher still exists to notify you of new posts
- The fileboard continues to exist now with the new file expansion method
- This is the more disappointing aspect. Currently there's no way for me to effectively render graphs without making the script huge and slow. The text is just colored, which leads to some strange situations.
- Posts can be expanded inline meaning text walls do not dominate threads
- Custom insertion of style tags. Typing [s test]text[/s] will give your text a span with the class of test. This means you can use custom CSS to a greater degree. For security, no transparency is allowed and urls, if not already, will override.
- Yen text and custom quote
- Replies to your posts will notify you when you have a response
- Clicking a cite will now open up a new window that creates a reply chain. This should make it possible to reply to large threads and at a glance tell what the discussion is like.
- Citation hover brings up a preview like always
Files, Embeds, and Flashes:
- Images and gifs expand like always
- Videos(mp4 and webm), audio and flashes open up in a new window that allows you to browse the thread and close them at any time.
- Embeds are given their own window that allows for a larger view than previously.
- Embedded videos are also given thumbnails making it much easier to load pages such as the music thread.
- Hover over a filename to reveal it all
- The reply or create box will follow you around unless you dismiss it(no longer have to worry about pressing back accidentally)
- The comment is under the WhatYouSeeIsWhatYouGet principle, or rich text. If you type in a quote the line of text will turn green. If you type [s custom]text[/s] then the text will be given your custom tag
- Post Previews allow for previewing how a post will look before it is submitted
- URL and Media submissions are combined into one element
- You can modify the filenames of your images by editing them in the given box
- Post submission on the overboard is multi-staged meaning you write up a post, then choose the board then if it's a poll you write up the poll.
- Replies to threads allow for a post queue to be created
List of bugs to resolve
- Fix API to create index expander
- Home Decache
- banned.php can be used to check bans
- /ec/ missing and options
- Circle for taba is wrong
- Handle situations with a moved file
- Hover preview for old style backlinks
- Rules and FAQ need proper files
- … sorting should be changed to bumps only
- Stickies not being handled properly
- Post Queue does not work(crash)
- Mobile should make use of text-areas
- Rich text has various bugs to resolve
- Rich text should handle legacy style formats
- Preview cancellation needs a better system
- Custom style security
- Home removes active threads on delete
- Reply count not being set
- golang sometimes does not write valid JSON(reason undetermined)
- Captcha not bypassing filter
- Capcodes not being set
- Mod icons not in a row
- Page list not in a row
- Spoilers, deletes, generic thumbnails not being used
- Reply notifications not being sent
- QR should handle quotes and thread numbers
- Handle HTML tags for summary and details
- Improve dragging of windowed files
- Article mode has width issues
- Tripcodes missing
- Cursor on ad hover
- Top bar tab not consistent with user choice sometimes
- Summary notifications don't lead to given post
- Summary notifications annoying when multiple tabs open, perhaps refine to board if not on home.
- 404 error not being created on 404 threads
- Whitelist token input
>>4244
>- Custom insertion of style tags. Typing [s test]text[/s] will give your text a span with the class of test. This means you can use custom CSS to a greater degree. For security, no transparency is allowed and urls, if not already, will override.
This sounds cool but you might want to add a prefix to the class names, or otherwise people could use class names for markup that clash with the class names used elsewhere, causing Kissu's scripts and/or userscripts to malfunction and maybe even causing malicious things to happen if they're clever enough.
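The prefix suggestion could look something like this in Go. This is only a sketch: where the tag rendering actually happens in Kissu isn't stated, and the `user-` prefix, the allowed characters in the class name, and the function name `renderStyleTags` are all assumptions.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// tagRe matches [s name]text[/s]. The name is restricted to letters, digits,
// "-" and "_" so nothing else can end up inside the class attribute.
var tagRe = regexp.MustCompile(`\[s ([A-Za-z0-9_-]+)\](.*?)\[/s\]`)

// classPrefix is a hypothetical prefix that keeps user classes from clashing
// with class names used by the site's own scripts or by userscripts.
const classPrefix = "user-"

// renderStyleTags escapes the comment first, then turns style tags into spans.
func renderStyleTags(comment string) string {
	escaped := html.EscapeString(comment)
	return tagRe.ReplaceAllString(escaped, `<span class="`+classPrefix+`$1">$2</span>`)
}

func main() {
	fmt.Println(renderStyleTags("[s test]text[/s]"))
	fmt.Println(renderStyleTags("<b>not markup</b>"))
}
```

With a prefix like this, a post using `[s test]` can only ever produce the class `user-test`, so custom CSS still works but site classes stay out of reach.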
Das a lotta features. Great job dealing with the spam, too.
>Clicking a cite will now open up a new window that creates a reply chain.
Like in 4chanX?
oh, heres one ive been wanting to mention but always forget. My laptop screen is small so if i ever have to solve a captchouli, the textbox to paste the code in as well as the submit button are hidden and i cant scroll down to it so i cant submit the code. ill take a picture of it next time it happens so i can show you
then, because that kept happening, i discovered that if the captchouli comes up i can just close the pop up window and hit the "New Reply" button and the post will submit anyway without me solving the captchouli
That's probably because the flood timer ran out, if it locked you into completing a captcha that would've been bad
I'd like to know what's going wrong. You can flick down to scroll the captcha, but it might not be very clear to do this.
let me see if i can recreate it real quick, im going to spam /trans/ until i get the flood message
/test/ will do as well
sure, let me see if i can recreate it. going to spam until i get the flood message, ill delete it all after
i see, it's not triggering the condensed version
I fixed a CSS rule so it should trigger a smaller version now when you don't have enough height.
>>4249
>Like in 4chanX?
A bit different and more along the idea of trying to create microthreads. Instead of making the user manually open every link what I've been working on is the ability to create a chain that automatically does it for you.
This UI is an interpretation of 4chanX's customization and automation features.
First new anti-spam tool is the last one I'd ever want to use.
Effectively blocks poster's IP ranges from posting until they submit a ban appeal.
This would probably be used if captcha got broken and someone was using a residential botnet,
or if I died and cool needed to keep the site running.
My preferred system will probably be an ISP check on post submit that locks someone using a webserver ISP, or whatever regex term is needed.
seems like you have to pay money to get this information in a timely manner(ie. not querying someone's website).
ISP detection would have to be done after the post. In this case it seems like a better thing to do after bans on top of an automatic subnet lockout to get better information about whether the IP is from a commercial service or residential area…
decided to make a filter using PHP's gethostbyaddr. It's a hit or miss solution to detecting if an IP is a proxy or not, but it seems to work in certain cases.
I believe it functions on a lookup table, so there might be ways to improve this
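For reference, a hit-or-miss hostname filter of this kind might look like the sketch below. The real filter is in PHP using gethostbyaddr; this is a Go approximation where `net.LookupAddr` does the reverse lookup, and the keyword list in `hostingRe` is a guess, not the rules Kissu uses.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// hostingRe is a hypothetical keyword list; real rules would need tuning.
var hostingRe = regexp.MustCompile(`(?i)(vps|server|hosting|cloud|colo|dedicated)`)

// looksLikeHosting reports whether a reverse-DNS name hints at a hosting provider.
func looksLikeHosting(hostname string) bool {
	return hostingRe.MatchString(hostname)
}

// flagged performs the reverse lookup and checks every returned name.
// A failed lookup counts as "not flagged", which is why this is hit or miss.
func flagged(ip string) bool {
	names, err := net.LookupAddr(ip)
	if err != nil {
		return false
	}
	for _, n := range names {
		if looksLikeHosting(n) {
			return true
		}
	}
	return false
}

func main() {
	// Sample hostnames are invented; main avoids real lookups so it runs offline.
	for _, h := range []string{"vps123.example-hosting.net.", "c-73-1-2-3.dsl.example-isp.net."} {
		fmt.Println(h, looksLikeHosting(h))
	}
}
```

Many addresses have no reverse record at all, and plenty of proxies sit behind names that look residential, so a check like this can only ever be one signal among several.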
I also added a whitelist table which will allow given users to get through filters and rangebans on approval.
My final system of auto-bans is going to be the bans table condenser which will estimate if a range should be locked down or not.
This will be an extension on top of the GoLang API server(Hazuki) to handle scheduled tasks of automatically adding via proxy lists and optimizing the bans table.
algorithm as follows: https://pastebin.com/cbZe1dbk
Algorithm as follows: https://pastebin.com/c0ttK8PS
Need to program in the helper functions and figure out how to work with IP addresses in golang. (I did the factorial loop incorrectly)
2 last topics of anti-spam are archive recovery and perceptual hashing
Hey, Vern, can you add in the ability to edit banners after you've posted them? There's been a few times where I've either wanted to update a banner or change the URL but been unable to do so without taking down the previous banner and uploading what's essentially a "new" one.
Archive restore is done, but it's half programmed on hazuki-golang and half on vichan(DB on api server, thread files on vichan) so I'll finish that off then move on to the perceptual hashing question.
After this, I'll leave mod tools until there's another need to improve them, but I think we have just about as many features as 4chan with the exception of better dashboards and management monitoring tools…
>>4295
I'll see what it takes when working on it comes up again, but it shouldn't be anything more than another option in the user dashboard
there's a function which checks if bans have expired and it seems to be doing that very inefficiently.
If I comment it out kissu can handle 160,000 bans easily, but with it on it struggles to post in under 10 seconds
oh man… it's Twig isn't it… the thing I'm trying to replace is attempting to build the entire bans table https://kissu.moe/bans-all.html
so it freezes posting because it tries to create a json representation of the data. I can't put it onto another thread because php is single threaded… that's messed up
Archive functionality is developed on hazuki-go to serve as a temporary backup. What's remaining on vichan is the index construction and the archive.json generation.
Archive pages are deprecated, but information about them will stay around (json files, html pages and archive as a temporary image storage). They're not worth the continued investment for functions other than moderation.
The proxy-logger is still making rangeban guesses on the 160,000 bans, but I realized my formula for /17 subnets was too large meaning none will be done as yet… when it's finished and I've moved over the archive stuff I'll run it again to do further compression if any may exist.
The logger can do /9 subnets but this may be overkill, I've yet to see.
This experiment is mostly about seeing if there's a way to confirm posting from risky locations rather than locking out. Kissu's whitelist-anti-ban feature will help here.
When this is moved over all that's left for the anti-spam purposes will be a perceptual hashing solution. Likely this too will be done on Hazuki-Go which will communicate with http://blockhash.io/
or a similar functioning package within golang.
What will follow is more work on the revised UI, see about some banner program alterations (editing banners, perceptual hashing duplicate checks might be good), could even do it as a warmup, make Hazuki-Go less hardcoded and I need to start thinking about money so I'll probably be making a bigger deal of that again.
>>4307
>Archive pages are deprecated
Does this mean you're thinking about getting rid of them? I hope not; I like to go back and read the threads I didn't have time to look at before archiving.
zzz there's a problem where it sometimes writes the json files incorrectly and it means I need to manually fix them. So naturally the property pages for home, all and so on have broken.
you can put that into a json linter to see what I mean. It might be a race condition where it writes chunks without locking first.
>>4308
I mean by this that they'll exist so long as vichan is still generating HTML pages but I don't intend to put a lot of time into fixing issues around them.
there being a separation between what happens on the two servers means that occasionally when there's an issue with the API server this sort of thing will happen.
It's easy enough for me to fix… i'll do it in a bit, but things are going to break again anyways so take your time on this feature
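If the broken json really is a race between writers and readers, one common fix is to write to a temporary file and rename it over the target, so a reader only ever sees a complete file. A minimal Go sketch under that assumption; `writeJSONAtomic`, `demo` and the file name `home.json` are hypothetical and this is not Hazuki's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// writeJSONAtomic marshals v, writes it to a temp file in the same directory,
// then renames it over the target so readers never observe a partial write.
func writeJSONAtomic(path string, v any) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*.json")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after a rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo writes a sample file into a throwaway directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "json-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "home.json") // hypothetical file name
	if err := writeJSONAtomic(path, map[string]int{"threads": 3}); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The rename is atomic on POSIX filesystems when both paths are on the same volume, which is why the temp file is created next to the target. Two goroutines writing the same file at once would still want a mutex around the call.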
Range optimizer and estimator took ~160,000 down to 139,220 bans. Currently /17 isn't in the list, but that wouldn't knock out too many more. There are currently 1163 /25 bans
This operation likely took a total of 20 hours where the ban insertion is a combination of string comparison and golang's CIDR creation from string and associated net.Contains method with an IP and range.
The ban insertion methods are n * m time complexity comparing all bans against inserted bans whereas the optimization method is n * 6n time complexity where it takes a given ban, creates /9 /17 and /25(6 times) variations on it and checks it against all bans for whichever is contained within it. The comparison method also is built to take a string parameter and convert it into an IP every time, this can be improved.
This process of 160,000 * 160,000 causes it to take the time it did (~20hr) and likely there needs to be an adjustment to the search method and the comparison algorithm. A binary search would make the complexity roughly n log n, by comparison about 160,000 * 17 with a base-2 logarithm (160,000 * 5.2 in base 10), however visualizing this for a byte is a bit more complicated and I want to get back to fixing the UI.
Just the act of sorting might be the fix and tell it to stop when items are no longer contained within the optimization range… In which case the O(n*n) complexity is inevitable, but a bit better. A binary search seems like the best choice here but I'd rather not go there for now unless necessary
These adjustments made it so that the entire process of adding 160k items and compressing them took an hour so with that done, kissu now has archive restore and a good system of removing IPs that are regularly reported to spam/public listings. If there are any mistakes then you can appeal them.
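The sort-then-stop idea can be sketched like this in Go. It is an illustration of the technique, not the code that ran on Hazuki: every ban is parsed once into a `netip.Prefix`, the list is sorted, and a single pass drops anything already covered by the last kept range. The function name `compress` is made up.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// compress removes bans that are already covered by a broader range ban.
// Entries may be single addresses or CIDR ranges; unparsable ones are skipped.
func compress(bans []string) []string {
	prefixes := make([]netip.Prefix, 0, len(bans))
	for _, s := range bans {
		// Parse once up front instead of converting strings on every comparison.
		if p, err := netip.ParsePrefix(s); err == nil {
			prefixes = append(prefixes, p.Masked())
		} else if a, err := netip.ParseAddr(s); err == nil {
			prefixes = append(prefixes, netip.PrefixFrom(a, a.BitLen()))
		}
	}
	// Sort by start address; on ties the broader range (fewer bits) goes first.
	sort.Slice(prefixes, func(i, j int) bool {
		if c := prefixes[i].Addr().Compare(prefixes[j].Addr()); c != 0 {
			return c < 0
		}
		return prefixes[i].Bits() < prefixes[j].Bits()
	})
	var kept []netip.Prefix
	for _, p := range prefixes {
		// Kept ranges never overlap, so only the most recent one can cover p.
		if n := len(kept); n > 0 && kept[n-1].Bits() <= p.Bits() && kept[n-1].Contains(p.Addr()) {
			continue
		}
		kept = append(kept, p)
	}
	out := make([]string, len(kept))
	for i, p := range kept {
		out[i] = p.String()
	}
	return out
}

func main() {
	bans := []string{"10.0.5.9", "10.0.0.0/16", "192.168.1.7", "10.0.5.0/24", "192.168.1.7"}
	fmt.Println(compress(bans))
}
```

Sorting costs n log n and the sweep is a single linear pass, so the pairwise 160,000 * 160,000 comparison is avoided entirely for the "is this ban already inside a range" check.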
speaking of appeals and stuff I ought to program a moderator input Bayesian spam filter because the current weakest link of the system where IPs are whitelisted is the appeals form….
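A moderator-trained Bayesian filter can be quite small. Below is a minimal naive Bayes sketch in Go with add-one smoothing; the type and method names (`filter`, `train`, `isSpam`) and the sample appeals are invented, and a real filter would want better tokenizing than splitting on whitespace.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// filter holds word statistics for two classes: index 0 = ham, 1 = spam.
type filter struct {
	counts [2]map[string]int // per-class word counts
	totals [2]int            // total words seen per class
	docs   [2]int            // messages seen per class
}

func newFilter() *filter {
	return &filter{counts: [2]map[string]int{{}, {}}}
}

// train records a moderator's verdict on one appeal.
func (f *filter) train(text string, spam bool) {
	c := 0
	if spam {
		c = 1
	}
	f.docs[c]++
	for _, w := range strings.Fields(strings.ToLower(text)) {
		f.counts[c][w]++
		f.totals[c]++
	}
}

// isSpam compares the log-probability of both classes using add-one smoothing.
func (f *filter) isSpam(text string) bool {
	vocab := map[string]bool{}
	for c := range f.counts {
		for w := range f.counts[c] {
			vocab[w] = true
		}
	}
	words := strings.Fields(strings.ToLower(text))
	var score [2]float64
	for c := 0; c < 2; c++ {
		score[c] = math.Log(float64(f.docs[c]+1) / float64(f.docs[0]+f.docs[1]+2))
		for _, w := range words {
			score[c] += math.Log(float64(f.counts[c][w]+1) / float64(f.totals[c]+len(vocab)+1))
		}
	}
	return score[1] > score[0]
}

func main() {
	f := newFilter()
	f.train("buy cheap pills now", true)
	f.train("cheap pills cheap deals", true)
	f.train("please unban my vpn", false)
	f.train("i was banned by mistake please check", false)
	fmt.Println(f.isSpam("cheap pills"), f.isSpam("please check my ban"))
}
```

Each time a mod marks an appeal as junk or genuine, `train` is called with the verdict, so the filter improves with use rather than needing a dataset up front.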
It's an old method, but I know how to do it easily… There may be more accurate modern alternatives that don't rely on neural networks.
in vichan there's an option in the deletePost function to not rebuild after a delete, this was turned on accidentally.
> $config['allowed_ext'] = 'webp';
this is strange, I have it enabled but for some reason I'm getting the invalid image error. Was working not long ago
I see, found the problem.
>>4326
should work as it has in the previous months
not sure if people would consider this to be too reddit or not but i was trying to think of a way that you could make it easier for people to dump images on /ec/ and came up with an idea that maybe there could be a way that's tied to the board's unique feature. my idea was that maybe for every 5 likes or whatever they're called that somebody gets they can skip a captcha on /ec/ only. 5 because it's small enough to be easily attainable and large enough to not be that abusable. maybe it's a dumb idea, but it might be nice
Why not just allow multi-file posting?
The UI for multiupload is poor and gives threads with image spam an unfair advantage. Also gives way to dilution of individual post focus and drives interest away from text towards images
There are other more technical issues, but hard to explain
captcha skips are an interesting idea.
Something I could consider is IP hierarchy. If an IP is deemed safe (a temporary entry in the whitelist?) it gets exempt from captchas. This means there would be another page for requesting captcha exemptions.
Concerns with this idea are mostly administration
I considered this as well, but it's probably not the best idea. For one not everyone has a static IP so you'd have to keep whitelisting IPs and this may run into issues for people that use VPNs or IPs without captcha that are no longer in use by the people who you meant to whitelist. If somebody with malicious intent were to stumble on one of them it could be very bad. Also, thinking in long term it may become a bit unsustainable with people maybe wondering why others are deemed worthy of this "kissu pass" of sorts and they aren't, and the more the userbase grows the more headaches such a system would cause. For my idea I think it'd probably incentivize using the features on /ec/ a bit more and maybe encourage people to post more if they don't need to solve a bunch of captchas each time they want to dump.
hm, I guess it could be that this is a form of IP verification. However, all I need is a phone and I have a free pass anyways. It's much more flimsy because I don't have mod tools built around fixing scores or polls, they exist solely for the sake of entertainment rather than accuracy.
I can't really trust anonymous peer reviews to give out exemptions. I think it has to go through a bureaucracy just as a ban would, with clear cut rules that a paper-pusher could follow…
You could make the exemptions reply-specific, so people can't use them to avoid thread-making captchas
limits to what can be done is a good idea regardless of what system gets used.
Captcha and rangebans are both exemptions that the 4chan pass gives and so far that's the ultimate system of security and freedom. But when you apply this into a 100% free model that I'm going for with kissu that idea falls apart and is too easy to abuse. It's tricky
Noticed that just clicking on the boards up top doesn't show the new posts anymore, seems that you need to refresh to get an updated page
Altered a bunch of mod features, whitelisting, ban appeals can be given deny reasons to allow more flexibility, and also added in embed deleting which wasn't possible before.
>>4364
annoying… I turned off some server config settings a while ago so might have caused it, but it could be a variety of things. Turned these settings back on, but I might need to find something else.
Beginning to make changes to logins and passwords
Added a counter to bruteforce on /mod.php? and increased complexity of SQL password in the rare case where the server IP gets exposed and someone begins bruteforcing the sql port.
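A bruteforce counter of this sort usually boils down to "no more than N failed logins per IP inside a time window". The Go sketch below shows that shape only; the actual counter lives in vichan's PHP, and the names `limiter`, `fail`, `allow` plus the limits of 3 attempts per 60 seconds are assumptions.

```go
package main

import "fmt"

// limiter tracks failed login times (unix seconds) per IP.
// It is not safe for concurrent use; a server would guard it with a mutex.
type limiter struct {
	max      int     // failures allowed inside the window
	window   int64   // window length in seconds
	failures map[string][]int64
}

func newLimiter(max int, window int64) *limiter {
	return &limiter{max: max, window: window, failures: map[string][]int64{}}
}

// fail records one failed login attempt for ip at time now.
func (l *limiter) fail(ip string, now int64) {
	l.failures[ip] = append(l.failures[ip], now)
}

// allow drops failures older than the window and reports whether ip may try again.
func (l *limiter) allow(ip string, now int64) bool {
	recent := l.failures[ip][:0]
	for _, t := range l.failures[ip] {
		if now-t < l.window {
			recent = append(recent, t)
		}
	}
	l.failures[ip] = recent
	return len(recent) < l.max
}

func main() {
	l := newLimiter(3, 60) // example limits: 3 failures per 60 seconds
	for i := 0; i < 3; i++ {
		l.fail("203.0.113.5", 100)
	}
	fmt.Println(l.allow("203.0.113.5", 110)) // still inside the window
	fmt.Println(l.allow("203.0.113.5", 200)) // old failures have aged out
}
```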
Speaking of the mod login page, I've noticed that when you delete one of your own posts, you're redirected to that page, and not to the board. Dunno if this has been brought up yet.
uhhh, that's not the case for me.. I wind up back on the board.
can you explain what exactly you do to get this?
Never mind, I just tested it and it redirected me to the board. As recently as a couple of weeks ago, it would redirect you to the mod login page - more specifically, iirc it would take you to a "post deleted successfully" page, and the "return" link on that page led to the mod login page.
seems like that kind of thing only works reliably on UDP sockets
phashing looks good.
The video version is broken(library outdated) so it can only be applied to images and gifs, but still, looks like a good tool
tests from https://kissu.moe/test/res/1078
If you get a VPN ban, appeal it and I'll add it to the whitelist.
If you appeal it and it neither gets cleared nor gets a denial reason, then appeal it again because I probably messed up the code
also something's making kissu posting slow again
attempting to get information from rbl.efnet.org was slowing the site down to a terrible pace.
Posting is back to normal speed
getHostByAddr is probably too strong
my whitelist was also having an issue that I'll have to test out
So anon that was submitting appeals should be able to post again
I'm undecided if I'll turn it back on or not, but in any case I'll enter this IP in to the whitelist manually after this is sorted out.
yeah, it was a silly mistake. My bad
Seems to still be happening on /qa/. There's a deleted wojak stuck on the front page for me, because nobody's posted anything to update the board.
anyone can try to delete it at that point and it will go away.
It seems like multi-board pages with an archive have it happen. I'll look into it deeper
2600:1700::/32 phone range is associated with Heyuri bs(wojack and stuff). The log of this subnet is both mostly unused and dubious in usage so ranging it for a bit
I'll be uploading some code,
If things start looking funny that's why
Added perceptual hashing but currently no filter rules are set up. Blockhash works well, spammers will have to put in a lot of effort.
Banners is also having perceptual hashing set up. To stay consistent I need to finish off making test cases for this and do the documentation work for it.
Banners also has a similar anti-bruteforce setup for it.
Also made it easier for mods to see ban appeals so we'll have a faster response time to these kinds of issues and requests for whitelisting.
>>4403
I figure if I'm going to be causing problems for VPNs I might as well commit to it.
>>4402
I manually added you into the VPN whitelist. Seems in order, but this is still an experimental idea.
so currently I need to:
Prep a banners security update,
Create hashes for existing banners
Begin building up an imagehash blacklist
Create a basic spam filter for appeals
then security issues are resolved.
mp4 and webm posting will be broken for today until I get the experimental video hashing working
looks like mp3 and flac will follow in being busted for now. I'll make it my primary fix
[4/6] Linking build/libstblockhash.a
../videohash.c: In function ‘process_video’:
../videohash.c:129:5: error: unknown type name ‘CvCapture’; did you mean ‘CvCmpFunc’?
../videohash.c:140:15: warning: implicit declaration of function ‘cvCreateFileCapture’; did you mean ‘cvCreateKalman’? [-Wimplicit-function-declaration]
capture = cvCreateFileCapture(filename);
../videohash.c:140:13: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
capture = cvCreateFileCapture(filename);
../videohash.c:152:5: warning: implicit declaration of function ‘cvSetCaptureProperty’; did you mean ‘cvSetWindowProperty’? [-Wimplicit-function-declaration]
cvSetCaptureProperty(capture, CV_CAP_PROP_POS_FRAMES, 0);
../videohash.c:152:35: error: ‘CV_CAP_PROP_POS_FRAMES’ undeclared (first use in this function); did you mean ‘CV_WND_PROP_VISIBLE’?
cvSetCaptureProperty(capture, CV_CAP_PROP_POS_FRAMES, 0);
../videohash.c:152:35: note: each undeclared identifier is reported only once for each function it appears in
../videohash.c:154:12: warning: implicit declaration of function ‘cvQueryFrame’ [-Wimplicit-function-declaration]
while (cvQueryFrame(capture)) frame_count++;
../videohash.c:168:33: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
IplImage* frame_image = cvQueryFrame(capture);
../videohash.c:181:17: warning: implicit declaration of function ‘cvEncodeImage’; did you mean ‘LZWEncodeImage’? [-Wimplicit-function-declaration]
mat = cvEncodeImage(".bmp", frame_image, NULL);
../videohash.c:181:15: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
mat = cvEncodeImage(".bmp", frame_image, NULL);
../videohash.c:233:5: warning: implicit declaration of function ‘cvReleaseCapture’; did you mean ‘cvReleaseData’? [-Wimplicit-function-declaration]
not looking worthwhile
i see, it's the API
in fact, all the images and videos were going through from the looks of it
it's just that vichan's "4chan api" breaks if the hash is wrong
finally, I'm retrialing the idea of duplicate checks. We'll see if it gets in the way of things or not.
There's a big bug in vichan filters where if someone posts at the exact second as someone who violates a filter rule, they'll be affected by it.
This might be some sort of condition where the $_SERVER['REMOTE_ADDR'] field gets mixed up… it's really odd
it might be caused by something else.. it was a really strange situation.
Banners have a security update(account limits per IP, max login attempts per time cycle, blockhashing without hash distance). No new features to that as yet. Registration was previously closed but is open again.
Kissu has distance evaluated blockhash.
You can test out the no-distance blockhash by posting an image and trying to modify it slightly and posting it again. To test out distance evaluated you'd need to trip up a spam filter.
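The difference between the two modes can be sketched as follows, assuming "distance" here means the Hamming distance between two hex-encoded hashes. The hash values and the threshold of 2 are invented for illustration, and `hammingHex` and `similar` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/bits"
	"strconv"
)

// hammingHex counts the differing bits between two equal-length hex hashes.
func hammingHex(a, b string) (int, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("hash lengths differ: %d vs %d", len(a), len(b))
	}
	dist := 0
	for i := 0; i < len(a); i++ {
		x, err := strconv.ParseUint(a[i:i+1], 16, 8)
		if err != nil {
			return 0, err
		}
		y, err := strconv.ParseUint(b[i:i+1], 16, 8)
		if err != nil {
			return 0, err
		}
		dist += bits.OnesCount8(uint8(x ^ y))
	}
	return dist, nil
}

// similar reports whether two hashes are within maxDist bits of each other.
// maxDist = 0 is the "no-distance" mode: only identical hashes match.
func similar(a, b string, maxDist int) bool {
	d, err := hammingHex(a, b)
	return err == nil && d <= maxDist
}

func main() {
	original, edited := "ff00", "fe01" // made-up 16-bit hashes
	fmt.Println(similar(original, edited, 0)) // exact match only
	fmt.Println(similar(original, edited, 2)) // tolerant of small edits
}
```

With a distance allowance, an image that has had a little noise added still lands close enough to the blacklisted hash to be caught, which is the point of forcing spammers to alter their images heavily.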
Last item is a simplistic email spam filter for the ban appeals in case it becomes such that someone tries to deny mods from picking out real appeals.
Proper appeals are valuable because the proxy searcher and proxy range compressor are starting to catch legitimate VPN users. It might also be worthwhile to provide a key to donors that lets them be auto-whitelisted from rangebans.
After this it's about bugfixing what's been done, from the UI that I haven't touched in a while to the bugs created from new mod tools.
It also goes without saying that since effort was put into vichan, kissu will likely always use vichan's post insertion and mod tools… items such as archive, deletion, polling, scoring and so on will likely eventually be moved over, but vichan's post filters and mod tools are great. I don't see a way around this.
I don't think they are supposed to represent "likes"…
if it looks like a like, and feels like a like, then it's probably a like.
-Deletions restored from archive.
-Seems like an image that was filtered got past hashing check
Also was not proxy
Resolved, it was setting the wrong board for rebuild
I have to start concerning myself with money.
I stalled it out for a long time but I'm paying an affordable rent now and my savings are going to start getting chunked out every month. Is it possible to make a poverty level living on imageboards alone? It's time to put it to the test… I'm pretty certain getting a full-time job will mean quitting imageboards or making it a background project. I've got a commitment towards this site and community that goes beyond myself and want to continue as long as I can afford it. I want to focus on this site as long as possible and keep living a life of flexible hours and see Kissu gain in quality.
I'd like to give some sort of reward to donors and am thinking that it might be ideal to privately give out a whitelist key that exempts you from any vpn/proxy/tor/phone/agent bans and captchas. There are some ways I could do it without a sort of cash down deposit, but this is by far the most practical and my current donors are very generous and I want to give them something special.
This is the donation page should you be interested https://www.patreon.com/ECVerniy
Money is an important priority in this stage of my work here, but it's more important for me to finish off the last security feature and create the whitelist keygen. At this point Kissu is in a state of management, maintenance and bug fixes, where beta features are ironed out(new UI) and fixes/adjustments are made to Kissu's software. However, ultimately I'd like to get away from the software and start focusing on the brand and I'm not too far from this point where I have a software and financial model I want to push. One that doesn't let me compromise my beliefs and stays true to the easy going atmosphere.
You won't have to deal with this kind of stuff right away, but in the next few weeks I'll start making a push. Until then, there are still bugs to solve and features to test.
added IP to whitelist. That ISP provides server hosting and residential IPs
What do you think of using koremutake instead of base64 for displaying secure tripcodes? http://shorl.com/koremutake.php
A major drawback of tripcodes is that nobody remembers them. So to impersonate someone, it's often sufficient to get the first few characters right. Koremutake encodes data as pronounceable strings which are easier to remember, so instead of someone's tripcode looking like !!PiTTj6CgcA, it would look like !!pikevafujojigru. It would only make sense to do this for secure tripcodes because with normal tripcodes it's desired that they look the same on every site.
hmm, it's an interesting idea and it would have to be done before anyone starts using them.
A system of base127 seems like it would be interesting, but could it just be done by appending a vowel to the end of each capital letter consonant?
PiTTj6CgcA would become PiiTaTaJu6CagcA. It's similar to how in navigation, when they read out codes, they say Alpha Bravo Charlie Foxtrot instead of ABCF.
>>4453
>A system of base127
correction: base128, so all it needs is bit shifts and lookups in the array.
>seems like it would be interesting, but could it just be done by appending a vowel to the end of each capital letter consonant
Wouldn't that still produce unpronounceable tripcodes when there are a lot of lowercase consonants in the tripcode? Looks harder to remember too.
Another option to consider would be words instead of syllables like gfycat uses for its content IDs.
For example, "doppler squalidly phoenician interfaced" using the Niceware wordlist. Might be a bit long, though.
Not sure if people would find it hard to see patterns in that. By contrast something brand new might be overkill. I suppose bitshifting it wouldn't be too hard though and this is a decision that should be made early rather than late, meaning overkill might be a good thing and give growing space.
>>4456
too long. I'm also reminded of anonymous sign in on other websites.
Made a calculation mistake there, to have about the same number of bits as current secure tripcodes (6 bits per character * 10 characters displayed), it would have to be 9 syllables, e.g.:
Testing some different options for encoding tripcode data. So far the one I like best is Koremutake with capitalization added: https://pastebin.com/3JEw0iA4
If we do change the secure tripcode presentation, it might also be a good idea to consider whether we should update the hash function used.
Are you doing these in PHP? The way that makes sense to me in PHP is a base64_decode which turns it into a 7 character binary string. The characters are converted into an array of numbers from 0 to 255. Each one is shifted 1 bit. It's not accurate to your method so I'm wondering if you had a better way to do it.
07ANLLIBMQ (60bit) decodes into the raw bytes D3 B0 0D 2C B2 01 31 (56bit), and each 8 bits gets shifted 1 bit, resulting in "FROVUDA - GUVYBA - HA" https://pastebin.com/pbh0Ntw7
I don't think it needs to be any more complicated than that, but the layout can be played around with
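The decode-and-shift step described above can be sketched in Python roughly like this. It assumes the shift is a right shift that drops the low bit of each byte, and that the table is the 128-syllable Koremutake list (rebuilt here from its consonant and vowel parts rather than copied from the site). The function name and the grouping into threes are only for illustration.

```python
import base64

# Reconstruction of the 128-entry Koremutake table: 15 single consonants and
# 7 clusters, each combined with A E I O U Y, truncated to 128 syllables.
_CONSONANTS = ["B", "D", "F", "G", "H", "J", "K", "L", "M", "N", "P", "R",
               "S", "T", "V", "BR", "DR", "FR", "GR", "PR", "ST", "TR"]
SYLLABLES = [c + v for c in _CONSONANTS for v in "AEIOUY"][:128]


def trip_to_syllables(trip: str, group: int = 3) -> str:
    """Decode a base64 tripcode and map each byte to one syllable."""
    # Pad to a multiple of 4 so the 10-character tripcode decodes cleanly.
    padded = trip + "=" * (-len(trip) % 4)
    raw = base64.b64decode(padded)
    # Dropping the low bit of each byte leaves an index in 0..127.
    syllables = [SYLLABLES[b >> 1] for b in raw]
    words = ["".join(syllables[i:i + group])
             for i in range(0, len(syllables), group)]
    return " - ".join(words)


print(trip_to_syllables("07ANLLIBMQ"))  # FROVUDA - GUVYBA - HA
```

Run on the tripcode from the post it gives the same "FROVUDA - GUVYBA - HA" string, so the table order at least matches for those seven syllables.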
Yea, that's small. I made it a bit bigger, but I don't want it to be jarring.
Ah, I see, I think I misunderstood the point of the bit shift because I was working with arrays and not straight binary. I need to backshift and modulo by 128 to maintain depth
It feels to me like when a tripcode gets past this length it starts being way too long for people to care about it. I did a quick birthday paradox check and it seems practically impossible for someone to pick the same one in a set of 128^7 so I might just run with this unless there's a reason to think otherwise
Is it worthwhile to have a jarring difference between trip styles? I think it looks ugly to mix and match them…
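For anyone who wants to redo the birthday paradox check, here is a rough sketch using the usual approximation 1 - exp(-k(k-1)/2N) with N = 128^7. The user counts are made-up round numbers, and this only covers accidental collisions between honest users, not someone deliberately brute-forcing a match.

```python
import math

SPACE = 128 ** 7  # 7 syllables, 7 bits each -> 2**49 possible tripcodes


def collision_probability(users: int, space: int = SPACE) -> float:
    """Birthday-bound approximation: 1 - exp(-k(k-1) / 2N)."""
    return -math.expm1(-users * (users - 1) / (2 * space))


for users in (1_000, 100_000, 10_000_000):
    print(f"{users:>10,} tripcodes -> {collision_probability(users):.3e}")
```

With a thousand tripcodes in use the chance of any two colliding comes out below one in a billion; it only becomes noticeable somewhere in the millions.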
If you're worried about the length, you might want to consider the capitalization idea, where you use the 8th bit of each byte to decide whether the syllable is capitalized. You could also think about the last option in >>4461, which is a bit shorter than the Koremutake encoding with some loss to pronounceability due to long strings of vowels. That one is just (in Python) 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'[n//5] + 'aeiou'[n%5] for each byte n.
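Written out as a runnable function, that one-liner would look something like the sketch below. The names `LETTERS`, `VOWELS` and `encode` are just labels picked for this example.

```python
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
VOWELS = "aeiou"


def encode(data: bytes) -> str:
    """Map each byte n to a letter (n // 5) followed by a vowel (n % 5)."""
    # 52 letters * 5 vowels = 260 pairs, enough to cover all 256 byte values.
    return "".join(LETTERS[n // 5] + VOWELS[n % 5] for n in data)


print(encode(bytes([0, 7, 255])))  # AaBiza
```

Every byte always becomes exactly two characters, so a 7-byte tripcode is 14 characters with no separators needed to decode it.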
After looking at some examples, I no longer think the capitalization idea is good: https://pastebin.com/aCrTEpbb
If you're going to do 7 syllables, maybe just for aesthetic purposes it would be a good idea to move the last hyphen between the 5th and 6th syllables so you have SOHEHO - RITO - SEDRO instead of SOHEHO - RITOSE - DRO.
>>4471
I can't think of a reason to have insecure trips at all if they're not consistent with other sites.
Titlecase and lowercase seem like the best quick fix. It might be possible to maintain case consistency by adding 128 more pronunciations. I used the list on http://shorl.com/koremutake.php without any modifications.
> Insecure trip
I suppose so, there's no reason other than perhaps using a solver to get funny combinations. Keeping legacy makes two different styles in one application. I'm not sure if I want this sort of inconsistency.
>>4474
>It might be possible to maintain case consistency by adding 128 more pronunciations.
['b', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'r', 's', 't', 'v', 'w', 'z', 'bl', 'br', 'ch', 'cl', 'cr', 'dr', 'dw', 'fl', 'fr', 'gl', 'gr', 'gw', 'pl', 'pr', 'qu', 'sh', 'sk', 'sl', 'sm', 'sn', 'sp', 'st', 'sw', 'th', 'tr', 'tw']
for the consonant parts and
['a', 'e', 'i', 'o', 'u', 'y']
for the vowel parts would give 43*6 = 258 combinations while maintaining phonetic distinguishability. I'm a bit torn between "gw" and "sv"; neither are very common, with "sv" being the more common one, but I think "gw" rolls off the tongue a bit better.
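To check the 43*6 count and see what the full-byte table looks like, something along these lines works. It is only a sketch: the ordering (consonant part outer, vowel inner) and the idea of indexing the table directly with the byte value are assumptions, not anything that has been decided in the thread.

```python
ONSETS = ['b', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'r', 's',
          't', 'v', 'w', 'z', 'bl', 'br', 'ch', 'cl', 'cr', 'dr', 'dw', 'fl',
          'fr', 'gl', 'gr', 'gw', 'pl', 'pr', 'qu', 'sh', 'sk', 'sl', 'sm',
          'sn', 'sp', 'st', 'sw', 'th', 'tr', 'tw']
VOWELS = ['a', 'e', 'i', 'o', 'u', 'y']

# Every consonant part paired with every vowel part.
SYLLABLES = [onset + vowel for onset in ONSETS for vowel in VOWELS]


def encode(data: bytes) -> str:
    """One syllable per byte; the last 2 of the 258 entries are never used."""
    return "".join(SYLLABLES[b] for b in data)


print(len(SYLLABLES), len(set(SYLLABLES)))  # 258 258
print(encode(bytes([0, 255])))              # batwo
```

Since every syllable is a consonant part followed by exactly one vowel, a string of them can be split back into bytes without separators.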
The 256 set starts looking a lot more foreign
What I'm thinking here is trying to make them sound like names, that way people can remember them easily. Some of these syllables make it seem off, but I suppose this is an improvement over before and the total number of hashes is now in the quadrillion range like before so I think it's alright.
Debugging my spam filter now
In which case title case works well
Inspired by the name concept, I used the SSA baby name dataset https://www.ssa.gov/oact/babynames/names.zip to construct a syllable list that may be a bit more natural. I tried to pick syllables that were commonly used both at the start of names and within names.
['ro', 'be', 'ma', 'li', 'ri', 'sa', 'na', 'la', 'de', 'me', 'vi', 'ra', 'mi', 'cha', 'ni', 'ca', 'le', 're', 'ta', 'da', 'mo', 'co', 'lo', 'so', 'che', 'za', 've', 'ke', 'va', 'ly', 'ge', 'ga', 'te', 'sha', 'se', 'the', 'ce', 'ne', 'sta', 'ste', 'bi', 'no', 'fe', 'bo', 'tri', 'my', 'cy', 'dy', 'lee', 'ci', 'do', 'bra', 'ja', 'lia', 'to', 'ka', 'di', 'ry', 'gla', 'go', 'wa', 'pe', 'xa', 'lea', 'ha', 'dia', 'gai', 'mia', 'we', 'si', 'ze', 'ba', 'gi', 'ree', 'phi', 'ti', 'ko', 'die', 'lei', 'ty', 'su', 'by', 'sie', 'lai', 'dwi', 'sca', 'pri', 'tha', 'rey', 'roy', 'tu', 'brie', 'sy', 'ny', 'kay', 'que', 'cia', 'cie', 'wi', 'bri', 'rya', 'sco', 'lio', 'xi', 'ki', 'lu', 'sto', 'nia', 'tie', 'vo', 'gu', 'fa', 'she', 'dua', 'hi', 'ria', 'tho', 'tia', 'nya', 'spe', 'pa', 'ley', 'vio', 'qua', 'loi', 'qui', 'bia', 'sey', 'chi', 'lay', 'nee', 'ru', 'dra', 'via', 'mu', 'teo', 'fre', 'miya', 'shia', 'kie', 'kia', 'saa', 'sea', 'sia', 'cly', 'he', 'maya', 'rai', 'sue', 'jo', 'mee', 'riya', 'mya', 'nie', 'shi', 'wre', 'niya', 'bree', 'sho', 'xo', 'rio', 'fu', 'dee', 'vie', 'dwa', 'cla', 'thea', 'tra', 'ble', 'shau', 'kai', 'lie', 'lya', 'naya', 'laya', 'ney', 'sla', 'blo', 'nai', 'po', 'way', 'smi', 'nu', 'xe', 'fi', 'kae', 'kee', 'phe', 'rae', 'jua', 'nea', 'mai', 'zia', 'kiya', 'liya', 'fo', 'zi', 'rye', 'kei', 'lle', 'nei', 'mie', 'cho', 'bria', 'tya', 'nay', 'nye', 'rea', 'toya', 'pha', 'ky', 'deo', 'hu', 'cle', 'tzi', 'lou', 'brey', 'dre', 'thia', 'khi', 'leya', 'llu', 'raya', 'cu', 'fae', 'mou', 'ho', 'cre', 'loy', 'sai', 'mae', 'rie', 'tre', 'cae', 'nyo', 'zu', 'leo', 'ceo', 'shee', 'pi', 'ray', 'thie', 'maa', 'shu', 'veo', 'bu', 'naa', 'xy', 'loui', 'lau', 'gra', 'sti', 'vy', 'xia', 'je', 'dio']
The code I used is at https://pastebin.com/7b6Qm9uU if you want to tweak it.
Also, the code STI wrote years ago to generate "secure" tripcodes is pretty bad. There's very little point in using a semi-modern hashing algorithm if you're just going to use it to generate a salt for ancient DES crypt(). Wouldn't surprise me if someone could crack the "secure" tripcodes just by enumerating over all possible 12-bit salts. You should rip that garbage out and replace it with a proper key derivation function.
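As a rough idea of what "a proper key derivation function" could look like, here is a sketch using PBKDF2 from Python's standard library. This is not vichan's code and not what Kissu runs; `site_secret`, the iteration count and the 7-byte output length are all assumptions picked for illustration.

```python
import hashlib


def tripcode_bytes(password: str, site_secret: bytes,
                   length: int = 7, iterations: int = 100_000) -> bytes:
    """Derive the raw tripcode bytes from the password after the ##.

    site_secret is a hypothetical per-site random value kept private on
    the server, so tripcodes cannot be precomputed without it.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               site_secret, iterations, dklen=length)


raw = tripcode_bytes("hunter2", b"example-site-secret")
print(raw.hex())
```

The output bytes could then be fed into whichever syllable table ends up being used. The point is only that the password goes through a slow, salted function directly instead of through DES crypt().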
Another thing that might make it look nicer is adding consonants to the ends of the words. If you're doing two words, you could end one word with consonants[n>>4] and another with consonants[n&15] where n is one of the bytes in the binary string being encoded. consonants could be something like
['', 'b', 'c', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 'r', 's', 't', 'x', 'z']
where I've excluded q, j, v, p, and w for being less common than other consonants at the end of words in the SSA baby names list, and the first item in the array is an empty string so that sometimes it ends in a vowel.
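A small sketch of that final consonant step, assuming two words that have already been built from the syllable table. The helper names and the example words are invented for this example; only the `consonants` array and the `n>>4` / `n&15` split come from the post.

```python
FINALS = ['', 'b', 'c', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 'r', 's',
          't', 'x', 'z']


def final_consonants(n: int) -> tuple[str, str]:
    """Split one byte into two 4-bit indexes into FINALS."""
    return FINALS[n >> 4], FINALS[n & 15]


def finish_words(first: str, second: str, n: int) -> str:
    """Append the two final consonants and title-case both words."""
    end_first, end_second = final_consonants(n)
    return f"{first + end_first} {second + end_second}".title()


print(finish_words("sohe", "rito", 0xA3))  # Sohen Ritod
```

Index 0 is the empty string, so one time in sixteen a word keeps its vowel ending.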
Came up with some more syllable list candidates: https://pastebin.com/GTJ0NL5z
syl1 is the same as >>4483
syl2 also requires that the syllables be common in words in Moby Dick. This removes a lot of foreign-sounding syllables. https://pastebin.com/03BtXPt1
Moby Dick from: https://www.gutenberg.org/files/15/15-0.txt
syl3 first enumerates over all consonant-vowel combinations except those starting with Q (Y considered a vowel). Then three-letter syllables are taken from the SSA baby names list. The result is slightly shorter syllables. I think it might look a bit better. https://pastebin.com/3LepHNqM
syl4 is like syl3 but with the Moby Dick filter applied, and with syllables starting with X also excluded from the two-letter combinations. https://pastebin.com/Amk7ETgf
Here's some sample names generated from each syllable list. I've also applied the final consonant idea from >>4484
From what I'm getting, the strength of crypt is based on installation settings, but I'll double check this
>PHP sets a constant named CRYPT_SALT_LENGTH which indicates the longest valid salt allowed by the available hashes.
>CRYPT_SHA512 - SHA-512 hash with a sixteen character salt prefixed with $6$.
But it could be interpreted that the salt has to be prefixed with $6$ to force it to pick sha512 as default
I see, so the autogenerated salts for vichan are of SHA length, but none of the salts for the crypt functions are prefixed with $6$rounds=1000$, so it defaults to a DES hash
well, they didn't make that mistake for the passwords so I guess trips was just a decade long oversight
I see, GPU cracking
I suppose moving data from RAM/Cache to CPU to GPU is going to take more time for more data so that the bottleneck becomes the transfer speeds rather than the GPU speed.
>Bcrypt isn't known to be much gpu friendly, the primary reason being the ridiculous amount of memory being used by each bcrypt hash. To make the matter worse the memory access is pseudo-random which makes it very difficult to cache the data into faster memory. With this we are left with two choices:
>1.Use the slow and large global memory and spend more time fetching the operands than processing them.
>2.Use the fast but small LDS(64KB) memory and severely limit the number of concurrent threads.
seems like the logins should be redone more than tripcodes
vichan has an easy system for changing hashes so they're using blowfish now.
some comments from IRC:
< sorry, didn't make myself clear, it's divided into four sections; which section looks like it has the more natural and memorable names?
<Anonymous> <syl1:Crenyaxiam SyrobohSuexothok BumiyatisLeebraneer NaisemaacKaesiadr - Pastebin.com> @ kissu.moe
<Anonymous> syl4 for me, but it's hard to see a difference
<Anonymous> way too much data…
<Anonymous> Crenyaxiam Syroboh
<Anonymous> Suexothok Bumiyatis
<Anonymous> Leebraneer Naisemaac
<Anonymous> Kaesiadreg Thaxagair
<Anonymous> Jedybil Tredeemeer
<Anonymous> Tribouwot Queshescoc
<Anonymous> Guquiesiez Theojouslyg
<Anonymous> Ledeaphah Reescotryc
<Anonymous> Neasaispe Roofutiam
<Anonymous> Scheleebral Traclesheen
<Anonymous> Chireechal Woseymon
<Anonymous> Raeglalex Bratenyz
<Anonymous> Thinieseex Lyawrevun
<Anonymous> Pyletiec Frehacek
<Anonymous> Radrezer Nyoleystim
<Anonymous> Taizyzun Jeepydeak
<Anonymous> Smakizus Fonemaut
<Anonymous> Bijeescoh Getyneg
<Anonymous> Niwoophox Philaytes
<Anonymous> Gaysituh Midraquir
<Anonymous> How could you miss the new line before "syl4"? If you're going to do it, at least be consistent.
<Anonymous> the one after 4 isn't a \r\n
<Anonymous> just a \n
<Anonymous> 4 seems arabic, 3 seems latin
<Anonymous> 2 doesn't really make sense
<Anonymous> 1 kind of reminds me of biblical names
<Anonymous> i think 3
accepted an appeal
New tripcode system made to imitate names (set3 in https://pastebin.com/GTJ0NL5z)
Ban list won't display the automatic vpn/proxy/tor bans.
Ban appeals can be detected for spam. Currently only detecting and not filtering
Update hashing of logins and secure hashes
Insecure tripcode solver (there's no point to having insecure hashes at all if there's nothing to do with them, so why not have people break them to create names)
If you don't want to do the final consonant thing in >>4484 it might be good to add some syllables with consonants at the end. I generated a new list where I'm allowing some consonants at the end of syllables. Consonants allowed at the end of syllables are not allowed to precede another consonant at the beginning of a syllable to prevent two inputs from mapping to the same string. I've also dropped some of the frequent consonants from the two-letter syllables; it seems to improve the fraction of names that can be generated. Currently 34.3% of the female baby names on the SSA list (weighted by frequency) can be generated from combinations of these syllables and a smaller fraction of the male names. I'm probably going to fiddle with this more to see if I can improve coverage.
['ba', 'be', 'bi', 'bo', 'bu', 'by', 'ca', 'ce', 'ci', 'co', 'cu', 'cy', 'da', 'de', 'di', 'do', 'du', 'dy', 'fa', 'fe', 'fi', 'fo', 'fu', 'fy', 'ga', 'ge', 'gi', 'go', 'gu', 'gy', 'ha', 'he', 'hi', 'ho', 'hu', 'hy', 'ja', 'je', 'ji', 'jo', 'ju', 'jy', 'ka', 'ke', 'ki', 'ko', 'ku', 'ky', 'la', 'le', 'li', 'lo', 'lu', 'ly', 'ma', 'me', 'mi', 'mo', 'mu', 'my', 'na', 'ne', 'ni', 'no', 'nu', 'ny', 'pa', 'pe', 'pi', 'po', 'pu', 'py', 'ra', 're', 'ri', 'ro', 'ru', 'ry', 'sa', 'se', 'si', 'so', 'su', 'sy', 'ta', 'te', 'ti', 'to', 'tu', 'ty', 'va', 've', 'vi', 'vo', 'vu', 'vy', 'har', 'ter', 'tho', 'ber', 'don', 'san', 'sha', 'den', 'lin', 'dan', 'lyn', 'the', 'bra', 'tri', 'ral', 'ric', 'nor', 'vin', 'lan', 'ver', 'lee', 'dia', 'ron', 'fre', 'lea', 'nel', 'lor', 'lia', 'gla', 'ger', 'phi', 'sta', 'ran', 'mer', 'min', 'han', 'war', 'mia', 'lon', 'lau', 'lei', 'fer', 'der', 'lar', 'mar', 'dal', 'ker', 'ken', 'per', 'nan', 'del', 'lay', 'wen', 'nic', 'bec', 'roy', 'mon', 'ree', 'die', 'kay', 'jan', 'bel', 'car', 'mil', 'ben', 'dre', 'bri', 'sco', 'tia', 'son', 'col', 'ste', 'man', 'lai', 'van', 'wan', 'tha', 'vel', 'rey', 'len', 'gio', 'tru', 'win', 'mai', 'mel', 'for', 'loi', 'bur', 'kie', 'gil', 'zel', 'dra', 'sue', 'rai', 'dol', 'woo', 'dee', 'gia', 'leo', 'cia', 'kia', 'lio', 'dor', 'sel', 'loy', 'cin', 'shi', 'xan', 'her', 'tra', 'key', 'nia', 'tor', 'mya', 'zan', 'mal', 'rol', 'von', 'wel', 'til', 'ray', 'kai', 'kel', 'ten', 'gen', 'way', 'nai', 'nya', 'she', 'pal', 'kee', 'gri', 'fan', 'ley', 'cie', 'rae', 'ton', 'kei', 'hai', 'tan', 'ren', 'sey', 'hel', 'qua', 'lil', 'sto', 'sie', 'kae', 'kin', 'tae', 'lou', 'can', 'sea', 'sol', 'bil', 'hen', 'hea', 'nee', 'vio', 'dwi']
some sample names:
it's beginning to become more than anyone could ask for in a tripcode on an anonymous website, but I can simply modulo the number by vowels+1 and it won't take me any extra time.
did what you suggested in 4484
Do you mean by the number of consonants? The idea of >>4484 was to take one of the bytes and instead of looking it up in the syllable table, compute n/16 and n%16 and use that to look up consonants to end the names with (with one option to let it end on a vowel). Wouldn't be much extra work. Switching to a syllable table with final consonants like >>4497 would work too, albeit producing slightly longer output (about 1 character longer).
A noteworthy consideration is that male names tend to end in consonants while female names tend to end in vowels. Thus >>4484 would be heavily biased toward male-like names whereas >>4497 would be biased toward female-like names. At least in theory.
Running some numbers:
probability of randomly generating real names from the SSA lists
where the numbers are probability (first word) * probability (second word) = probability (both words)
syl3 without >>4484
2.29804e-07 * 5.81503e-04 = 1.33631e-10 (female)
4.16767e-08 * 1.055e-04 = 4.396e-12 (male)
syl3 with >>4484
4.15631e-05 * 4.15631e-05 = 1.72749e-09 (female)
1.03712e-05 * 1.03712e-05 = 1.07562e-10 (male)
final consonants included in the syllables >>4497
2.21422e-07 * 5.669e-04 = 1.255e-10 (female)
4.21423e-08 * 1.31249e-04 = 5.53113e-12 (male)
Hmm… by these numbers, they're all female biased in a way, and the main thing affecting the numbers is the word length.
Another interesting measure is what percentage of names can be generated if you're using a tripcode cracker.
If we want a whole word match, syl3 without >>4484 can get to fewer names because it requires the name end with a vowel.
Below, the percent of names reachable by syl3 for various numbers of syllables (out of all names of the given gender, weighted by name frequency):
1 syllable: 0.7% of female names, 0.7% of male names
2 syllables: 14.9% of female names, 4.8% of male names
3 syllables: 9.2% of female names, 1.2% of male names
4 syllables: 1.9% of female names, 0.0% of male names
5+ syllables: 0.0% of female names, 0.0% of male names
total (any number of syllables): 26.6% of female names, 6.8% of male names
If we want a partial match, syl3 without >>4484 should perform better than above because a lot of names ending in a consonant will be found as substrings even though the whole words end in a vowel. I haven't calculated those numbers yet, though.
Here are the same numbers for the alternatives:
syl3 with >>4484
1 syllable: 1.4% of female names, 3.5% of male names
2 syllables: 22.4% of female names, 22.7% of male names
3 syllables: 11.3% of female names, 3.2% of male names
4 syllables: 1.9% of female names, 0.1% of male names
5+ syllables: 0.0% of female names, 0.0% of male names
total (any number of syllables): 37.0% of female names, 29.5% of male names
final consonants included in the syllables >>4497
1 syllable: 0.6% of female names, 1.0% of male names
2 syllables: 21.7% of female names, 12.2% of male names
3 syllables: 10.0% of female names, 2.3% of male names
4 syllables: 1.9% of female names, 0.1% of male names
5+ syllables: 0.0% of female names, 0.0% of male names
total (any number of syllables): 34.3% of female names, 15.6% of male names
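For reference, a whole-word reachability check like the one behind these percentages can be sketched as below. This is not the script that produced the numbers above: the syllable set and name counts are toy data, the function names are made up, and it ignores the cap on how many syllables a tripcode actually has.

```python
def can_spell(name: str, syllables: set[str], finals=("",)) -> bool:
    """True if name splits into whole syllables plus one allowed ending."""
    name = name.lower()
    max_len = max(map(len, syllables))
    for final in finals:
        if not name.endswith(final):
            continue
        body = name[:len(name) - len(final)]
        # reachable[i] is True when body[:i] splits into whole syllables.
        reachable = [True] + [False] * len(body)
        for i in range(1, len(body) + 1):
            for size in range(1, min(max_len, i) + 1):
                if reachable[i - size] and body[i - size:i] in syllables:
                    reachable[i] = True
                    break
        if body and reachable[-1]:
            return True
    return False


def weighted_coverage(counts: dict[str, int], syllables: set[str],
                      finals=("",)) -> float:
    """Fraction of name occurrences that can be spelled."""
    hit = sum(c for n, c in counts.items() if can_spell(n, syllables, finals))
    return hit / sum(counts.values())


toy_syllables = {"ma", "ri", "le", "na", "da", "vi", "no"}
toy_counts = {"Marina": 3, "David": 1}  # invented counts, not SSA figures
print(weighted_coverage(toy_counts, toy_syllables))             # 0.75
print(weighted_coverage(toy_counts, toy_syllables, ("", "d")))  # 1.0
```

Passing a list of allowed endings in `finals` stands in for the >>4484 variant, and leaving it at the default corresponds to names that must end on a syllable.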
Why is it still 7 syllables?
a mistake.
>>4502
This is too much tripcode… these modifications make it much better than what anyone else uses. The combinations don't have to be masculine or feminine, they just have to look somewhat appealing and recognizable without the use of a solver
Is this still the list from >>4475? I thought you changed it to syl3.
If you implemented >>4484 it should be 6 syllables not counting the final consonants. >>4484 is supposed to make the tripcodes shorter.
good catch, I mixed up the backup and upload
I'm happy with how they look now. It's achieved the goal of turning tripcodes from random characters into syllables that are pronounceable and somewhat pleasing to the eye. There are other tasks to build, so getting stuck on a feature no one uses yet would be wasted time.
It's also had the side effect of increasing hash security for both tripcodes and mod logins
As it is now the final consonants are just making the tripcode longer but not carrying any additional information. What I was suggesting with the consonant idea is that you can drop one of the syllables and instead use that byte n to make the two consonants with n/16 and n%16.
Are you pushing these changes to Github? I know you don't want to waste any more time (and I probably shouldn't either), but if I could see what you've implemented I could just send you a patch or pull request with the fix.
I see. Yeah that could be good.
>>4512
I'll make another push into a separate repo because this version of kissu is practically pointless without the golang server. I just have to check there's nothing important in any of the files
yeah it looks a bit nicer
uploading the single file is faster https://github.com/ECHibiki/Kissu-Files