Moves 💩 ideas to BADIDEAS

Nemo 2020-09-05 13:53:27 +05:30
parent c40cf7af03
commit 7d92c7b498
2 changed files with 104 additions and 101 deletions

102
BADIDEAS.md Normal file

@ -0,0 +1,102 @@
# :poop:
Not all ideas are great. These are things I thought might work at one point, but no longer consider worth building. I don't remove such ideas from the repo, because I think all ideas are worth learning from.
## OpenBook 💩
_Edit_: Lots of marketing companies already have built this, but not in the creepy way that I'd have liked.
This is a privacy-awareness application that relies on the TrueCaller data-sharing model. It is meant to teach users that their privacy is not in their hands, but in the hands of those that they trust.
The idea itself is to let people browse facebook as someone else. This relies on the following two points:
1. Facebook allows OAuth API access that allows your application to perform tasks as any user whose token you have.
2. To start using the service, you must "hand in your data" first. Just like TrueCaller uploads _your entire phonebook_ before you can use the app, openbook requires you to give us access to _your facebook account_ before you can use openbook.
### Expectations
This has massive privacy implications. People assume their facebook data (such as the information on their about page) is available only to people they have friended on facebook. However, it is also accessible to every application they authorize, which is basically what we are exploiting. I expect the application to be banned and have its keys revoked within a day or two. If I were to make this, I'd rather not use my own facebook account for fear of getting it suspended.
### Features
Since the idea is to teach users about privacy, openbook will not ask for access to all data. It will probably use a harmless data-property that will essentially be made public to all users of the app. Currently, I'm thinking "No. of Friends", which is not always available to the public if you've hidden your friend list. It is harmless enough to not cause any mayhem, and yet vital enough to show that such _attacks are possible_.
### Interface
1. Homepage will be a simple FAQ on privacy-related matters and what the app does.
2. A login page
3. A search interface to search amongst all users using the app.
4. A profile page of any user, where you can see their data using someone else's token.
### Backend
Unlike facebook, which can directly access _any data_ on their servers, we are limited to using their API. Which means getting data for any user has now become a mapping problem. You need to know which tokens can access data of which user. This could be made possible by pre-mapping the users accessible via each token, and refreshing each token's list every week or so (as you make new friends).
This idea probably breaks lots of points in Facebook's ToS, but that doesn't mean it can't be built.
## 💩 iStalk
This is essentially a unified profile mechanism, where a user's identity is defined by all of their activity on various networks. While this has some cool sub-ideas (like correlating activity between various networks), the most important implication that arises is that it can be a perfect tool for stalking. However, you can easily add in consent from the original profile owner to clear that concern.
### Idea
iStalk lets you "follow" a person across the various networks that they belong to. Some networks, like twitter, are mostly public and can be followed very easily. But other networks (like Facebook) are harder to follow because they require a "connection" between the two users. iStalk makes it easier for you by using your account on that network as a proxy-connect mechanism.
Basically, you give us your access tokens, and we use them to stalk someone for you.
### Interface
A profile creation page allows you to specify as much information as you have on the person. This includes, for example, their facebook, twitter, last.fm and github usernames. Each integration you want has to first be verified by you granting us (iStalk) access to your account (via OAuth).
Once a profile has been created, we will continuously long-poll the service to fetch new information as and when it becomes available. Real-time notifications are delivered to you as the person's activity is tracked.
## 💩Lettersafe
My notes for lettersafe are on [workflowy](https://workflowy.com/s/5439f7a9-3762-f247-3e96-4d047b5d4ce0).
The idea was to build a zero-knowledge email storage. Kinda like lavabit, with a few minor changes. After thinking about it for months, and working on it for a few days, I gave up on the idea. I understand now that server-side email encryption can never be a good idea.
## 💩 nofollow enforcer
### Introduction
Read [What is rel=nofollow](https://support.google.com/webmasters/answer/96569?hl=en) if you don't know what that is.
The above link has the following specific snippet:
> If you want to recognize and reward trustworthy contributors, you could decide to automatically or manually remove the nofollow attribute on links posted by members or users who have consistently made high-quality contributions over time.
However, this is something that is not even attempted by the majority of large user-generated content websites such as medium, quora or even wikipedia. This is an attempt to fix this problem. [Wikipedia](https://meta.wikimedia.org/wiki/Nofollow) had a lot of discussion on this topic before enabling nofollow on all external links in 2007.
### Need
- rel=nofollow saves you from spam
- it improves the quality of content
- it reduces the quality of backlinks
We need a reliable way that balances these two approaches. Something that:
- detects low quality link spam and marks those links as nofollow
- detects high quality links as good and lets those be followed by search engines
For the links that fall in the middle, i.e. the ones we aren't so sure about, we can either play safe and mark them as nofollow or specify a threshold score that must be matched before we unmark them.
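A minimal sketch of the threshold option, assuming each link already has a quality score between 0 and 1. The function name `rel_for_score` and the default threshold of `0.8` are made up for illustration; nothing here is a decided design.

```python
from typing import Optional


def rel_for_score(score: float, follow_threshold: float = 0.8) -> Optional[str]:
    """Return the rel value for a link, given a quality score in [0, 1].

    Links below the threshold, including the uncertain middle, stay
    nofollow (the "play safe" option). Links at or above it get no
    rel value at all, so search engines may follow them.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return None if score >= follow_threshold else "nofollow"
```

Lowering `follow_threshold` unmarks more of the middle band; raising it keeps more links nofollow.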
### Solution
- StackOverflow is one of the few websites that has even attempted to solve [this issue](http://meta.stackexchange.com/questions/111279/remove-nofollow-on-links-deemed-reputable). Their solution involves a lot of metadata they already have about the link (such as answer/comment score, votes, age etc) and they use that to mark a link as reputed.
Their approach is closed, so as to dissuade spammers from understanding it and working around it.
The general idea is to build a machine learning system that takes in a piece of content along with some optional metadata (such as authorship information) and then uses it to mark each link in the content as reputed/nofollow.
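To make the input/output shape concrete, here is a rough sketch that takes HTML content plus one piece of metadata and decides nofollow per link. The scoring function is a hand-written stand-in for the machine learning system, not the system itself. The domain list, the weights, and the names `score_link` and `mark_links` are all assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: a tiny hand-picked allowlist standing in for a trained model.
REPUTED_DOMAINS = {"wikipedia.org", "nytimes.com"}


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def score_link(url, author_reputation=0.0):
    """Placeholder score in [0, 1]; a real system would use a model here."""
    host = (urlparse(url).hostname or "").lower()
    reputed = any(host == d or host.endswith("." + d) for d in REPUTED_DOMAINS)
    base = 0.9 if reputed else 0.2
    return min(1.0, base + 0.5 * author_reputation)


def mark_links(html, author_reputation=0.0, threshold=0.8):
    """Map each link in the content to True if it should be nofollow."""
    collector = _LinkCollector()
    collector.feed(html)
    return {
        url: score_link(url, author_reputation) < threshold
        for url in collector.links
    }
```

For example, `mark_links('<a href="https://en.wikipedia.org/wiki/Nofollow">x</a>')` leaves the Wikipedia link followed, while a link to an unknown host stays nofollow.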
### Notes
_These are free-flow notes about the problem/solution/idea._
- Training Data: Since so many sites already have heaps of user generated content, we can use their data (stackoverflow, wikipedia provide dumps) to get good quality links.
- Reputed Sites: A very quick way to get off the ground would be to mark sources as reputed, such as wikipedia, new york times etc.
- Accuracy: As a very basic check, we could use StackOverflow dataset (since it is partially nofollow) to check our own accuracy against their implementation.
- Metadata: Will have to consider some metadata (such as authorship) along with the content itself, so as to improve spam accuracy. For instance, if the same user is posting too many links in a short time, we might mark future attempts as spam.
- We could use akismet partially as a heuristic for partial content.
- The idea is to build an API quite similar to akismet that can be embedded as plugins in content sites like wordpress and wikis.
- We should also take care to actually check the link itself. For instance, the original link might point to some dummy site, but redirect to a spam website. Google would follow the redirect while crawling, and land up on the spam site if we are not careful.
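The metadata note above (same user posting too many links in a short time) could look something like this sliding-window counter. The class name `LinkRateTracker`, the limit of 5 links and the 60 second window are placeholder assumptions, not tuned values.

```python
from collections import defaultdict, deque


class LinkRateTracker:
    """Flags users who post more than max_links links within a time window."""

    def __init__(self, max_links=5, window_seconds=60.0):
        self.max_links = max_links
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # user_id -> timestamps of recent links

    def record(self, user_id, timestamp):
        """Record one posted link; return True if the user is over the limit."""
        recent = self._seen[user_id]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_links
```

A `True` result would only be one signal among several; it could lower the link's score instead of marking it as spam outright.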

103
README.md

@ -5,12 +5,13 @@
**Legend**
- ✨ - Favorite Ideas. I really like these, and you could possibly build a company out of some of them.
- 💩 - Not all ideas are great. These are things I thought might work at one point, but no longer consider worth building. I don't remove such ideas from the repo, because I think all ideas are worth learning from.
- 🎁 - I'd love to help you build this, consider this like a bounty. You'll get a personal postcard from me for building something on this list.
- 🚀 Someone (might be me) built this!
- 🚧 There is a work-in-progress implementation.
- 👩‍🔬 Research ideas
The :poop: ideas (I thought might work at one point, but no longer consider worth building) are at [BADIDEAS.md](BADIDEAS.md).
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**
@ -69,54 +70,6 @@ This is a open repository of personal ideas. Some of these are based on personal
The emojis are just indicative. Make something people want is the YC motto, but sometimes you must make something for no good reason other than "just because".
## OpenBook 💩
_Edit_: Lots of marketing companies already have built this, but not in the creepy way that I'd have liked.
This is a privacy-awareness application that relies on the TrueCaller data-sharing model. It is meant to teach users that their privacy is not in their hands, but in the hands of those that they trust.
The idea itself is to let people browse facebook as someone else. This relies on the following two points:
1. Facebook allows OAuth API access that allows your application to perform tasks as any user whose token you have.
2. To start using the service, you must "hand in your data" first. Just like TrueCaller uploads _your entire phonebook_ before you can use the app, openbook requires you to give us access to _your facebook account_ before you can use openbook.
### Expectations
This has massive privacy implications. People assume their facebook data (such as the information on their about page) is available only to people they have friended on facebook. However, it is also accessible to every application they authorize, which is basically what we are exploiting. I expect the application to be banned and have its keys revoked within a day or two. If I were to make this, I'd rather not use my own facebook account for fear of getting it suspended.
### Features
Since the idea is to teach users about privacy, openbook will not ask for access to all data. It will probably use a harmless data-property that will essentially be made public to all users of the app. Currently, I'm thinking "No. of Friends", which is not always available to the public if you've hidden your friend list. It is harmless enough to not cause any mayhem, and yet vital enough to show that such _attacks are possible_.
### Interface
1. Homepage will be a simple FAQ on privacy-related matters and what the app does.
2. A login page
3. A search interface to search amongst all users using the app.
4. A profile page of any user, where you can see their data using someone else's token.
### Backend
Unlike facebook, which can directly access _any data_ on their servers, we are limited to using their API. Which means getting data for any user has now become a mapping problem. You need to know which tokens can access data of which user. This could be made possible by pre-mapping the users accessible via each token, and refreshing each token's list every week or so (as you make new friends).
This idea probably breaks lots of points in Facebook's ToS, but that doesn't mean it can't be built.
## 💩 iStalk
This is essentially a unified profile mechanism, where a user's identity is defined by all of their activity on various networks. While this has some cool sub-ideas (like correlating activity between various networks), the most important implication that arises is that it can be a perfect tool for stalking. However, you can easily add in consent from the original profile owner to clear that concern.
### Idea
iStalk lets you "follow" a person across the various networks that they belong to. Some networks, like twitter, are mostly public and can be followed very easily. But other networks (like Facebook) are harder to follow because they require a "connection" between the two users. iStalk makes it easier for you by using your account on that network as a proxy-connect mechanism.
Basically, you give us your access tokens, and we use them to stalk someone for you.
### Interface
A profile creation page allows you to specify as much information as you have on the person. This includes, for example, their facebook, twitter, last.fm and github usernames. Each integration you want has to first be verified by you granting us (iStalk) access to your account (via OAuth).
Once a profile has been created, we will continuously long-poll the service to fetch new information as and when it becomes available. Real-time notifications are delivered to you as the person's activity is tracked.
## ✨🎁 Collaborative Bookmarking
There are a dozen bookmarking services out there, many of them quite well done. However, most services are focused on the idea that bookmarking is a lone-person habit, which someone does in isolation.
@ -151,12 +104,6 @@ A recommendation engine built on top of my facebook data is a good idea, I think
Workflowy is a cool tool that I use for note-taking. It allows infinitely nested lists with @mention and #hashtag support. One thing it lacks currently is API for me to access my own data. I think workflowy is a great tool that could become a lot better if there were a way for developers to hook into it. (For example using workflowy as a data-backend for a todo-app).
## 💩Lettersafe
My notes for lettersafe are on [workflowy](https://workflowy.com/s/5439f7a9-3762-f247-3e96-4d047b5d4ce0).
The idea was to build a zero-knowledge email storage. Kinda like lavabit, with a few minor changes. After thinking about it for months, and working on it for a few days, I gave up on the idea. I understand now that server-side email encryption can never be a good idea.
## Email on top of keybase
Keybase has a cool API. I wonder if it's possible to build an actual email service on top of keybase?
@ -560,52 +507,6 @@ A slightly stale version of this data is available at https://www.bescom.org/upo
Credits: https://twitter.com/kingslyj/status/1219697117909803008
## 💩 nofollow enforcer
### Introduction
Read [What is rel=nofollow](https://support.google.com/webmasters/answer/96569?hl=en) if you don't know what that is.
The above link has the following specific snippet:
> If you want to recognize and reward trustworthy contributors, you could decide to automatically or manually remove the nofollow attribute on links posted by members or users who have consistently made high-quality contributions over time.
However, this is something that is not even attempted by the majority of large user-generated content websites such as medium, quora or even wikipedia. This is an attempt to fix this problem. [Wikipedia](https://meta.wikimedia.org/wiki/Nofollow) had a lot of discussion on this topic before enabling nofollow on all external links in 2007.
### Need
- rel=nofollow saves you from spam
- it improves the quality of content
- it reduces the quality of backlinks
We need a reliable way that balances these two approaches. Something that:
- detects low quality link spam and marks those links as nofollow
- detects high quality links as good and lets those be followed by search engines
For the links that fall in the middle, i.e. the ones we aren't so sure about, we can either play safe and mark them as nofollow or specify a threshold score that must be matched before we unmark them.
### Solution
- StackOverflow is one of the few websites that has even attempted to solve [this issue](http://meta.stackexchange.com/questions/111279/remove-nofollow-on-links-deemed-reputable). Their solution involves a lot of metadata they already have about the link (such as answer/comment score, votes, age etc) and they use that to mark a link as reputed.
Their approach is closed, so as to dissuade spammers from understanding it and working around it.
The general idea is to build a machine learning system that takes in a piece of content along with some optional metadata (such as authorship information) and then uses it to mark each link in the content as reputed/nofollow.
### Notes
_These are free-flow notes about the problem/solution/idea._
- Training Data: Since so many sites already have heaps of user generated content, we can use their data (stackoverflow, wikipedia provide dumps) to get good quality links.
- Reputed Sites: A very quick way to get off the ground would be to mark sources as reputed, such as wikipedia, new york times etc.
- Accuracy: As a very basic check, we could use StackOverflow dataset (since it is partially nofollow) to check our own accuracy against their implementation.
- Metadata: Will have to consider some metadata (such as authorship) along with the content itself, so as to improve spam accuracy. For instance, if the same user is posting too many links in a short time, we might mark future attempts as spam.
- We could use akismet partially as a heuristic for partial content.
- The idea is to build an API quite similar to akismet that can be embedded as plugins in content sites like wordpress and wikis.
- We should also take care to actually check the link itself. For instance, the original link might point to some dummy site, but redirect to a spam website. Google would follow the redirect while crawling, and land up on the spam site if we are not careful.
## 🎁 Mars: Terraform Remote HTTP Backend with End-to-End encryption
A fork of <https://www.terraform.io/docs/backends/types/http.html>, which changes the configuration format to: